FBLinkScrapper

A Python script that downloads all the links shared in an entire Facebook conversation.

Outputs the conversation in JSON format, along with the JSON for each individual chunk.

Initial Setup

Complete the following steps for both chat.py and group_chat.py.

In Chrome, open facebook.com/messages and open any conversation with a fair number of messages
Open the network tab of the Chrome Developer tools
Scroll up in the conversation until the page attempts to load previous messages
Look for the POST request to thread_info.php
To complete the setup, copy the following parameters from this request into the Python script:
Set the cookie value to the value you see in Chrome under Request Headers
Set the __user value to the value you see in Chrome under Form Data
Set the __a value to the value you see in Chrome under Form Data
Set the __dyn value to the value you see in Chrome under Form Data
Set the __req value to the value you see in Chrome under Form Data
Set the fb_dtsg value to the value you see in Chrome under Form Data
Set the ttstamp value to the value you see in Chrome under Form Data
Set the __rev value to the value you see in Chrome under Form Data
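As a rough sketch of where the copied values end up, the snippet below groups them into a headers dictionary and a form-data dictionary and checks that none are missing. The function name build_request_parts and the placeholder strings are illustrative assumptions, not names taken from chat.py or group_chat.py; only the parameter names come from the list above.

```python
# Hypothetical helper: groups the values copied from Chrome's network tab.
# Only the parameter names come from the setup list; the rest is illustrative.

REQUIRED_FORM_FIELDS = [
    "__user", "__a", "__dyn", "__req", "fb_dtsg", "ttstamp", "__rev",
]


def build_request_parts(cookie, form_values):
    """Return (headers, form_data) built from the copied values."""
    # Refuse to continue if any Form Data value was left empty.
    missing = [name for name in REQUIRED_FORM_FIELDS if not form_values.get(name)]
    if missing:
        raise ValueError("missing form values: " + ", ".join(missing))
    headers = {"cookie": cookie}  # value from Request Headers
    form_data = {name: form_values[name] for name in REQUIRED_FORM_FIELDS}
    return headers, form_data


if __name__ == "__main__":
    placeholders = {name: "<paste value>" for name in REQUIRED_FORM_FIELDS}
    headers, form_data = build_request_parts("<paste cookie>", placeholders)
    print(sorted(form_data))
```

Checking for empty values up front can make a forgotten parameter easier to spot than an empty response from the server.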

You're now all set to start downloading messages.

Downloading Messages

Get the conversation ID for those messages by opening http://graph.facebook.com/{username-of-chat-partner}
Copy the id value from there
For group conversations, the ID can be retrieved from the messages tab, where it appears as part of the URL. Use group_chat.py instead of chat.py for these conversations.
Run the command python chat.py {id} 2000, replacing {id} with the ID you retrieved earlier

Messages are saved by default to Messages/{id}/
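A minimal sketch of how the default output location could be derived from the conversation ID is shown below. The helper name ensure_output_dir and its base argument are assumptions for illustration; the actual scripts may build the path differently.

```python
import os


def ensure_output_dir(conversation_id, base="Messages"):
    """Create Messages/{id}/ if it does not exist and return its path."""
    path = os.path.join(base, str(conversation_id))
    os.makedirs(path, exist_ok=True)  # no error if the directory already exists
    return path


if __name__ == "__main__":
    # Hypothetical conversation ID, used only for demonstration.
    print(ensure_output_dir("1234567890"))
```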

Known Issues

The script sometimes has trouble with very large conversations (>100k messages). Facebook appears to rate-limit these requests and returns empty responses. In such cases, the script retries every 30s until it gets a valid response.

It may take the script several tries to get a valid response.
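The retry behaviour described above can be sketched roughly as follows. This is not the code from chat.py: fetch is a hypothetical zero-argument callable standing in for one request, and the sleep and max_tries arguments are additions that make the loop testable and bounded.

```python
import time


def fetch_with_retry(fetch, wait_seconds=30, sleep=time.sleep, max_tries=None):
    """Call fetch() until it returns a non-empty response.

    fetch     -- hypothetical callable that performs one request
    sleep     -- injectable so the wait can be skipped in tests
    max_tries -- optional cap; None keeps retrying indefinitely
    """
    tries = 0
    while True:
        tries += 1
        response = fetch()
        if response:  # an empty response is treated as rate limiting
            return response, tries
        if max_tries is not None and tries >= max_tries:
            raise RuntimeError("no valid response after %d tries" % tries)
        sleep(wait_seconds)


if __name__ == "__main__":
    # Simulated responses: two empty replies, then a valid one.
    fake = iter(["", "", '{"payload": []}'])
    body, tries = fetch_with_retry(lambda: next(fake), sleep=lambda s: None)
    print(tries, body)
```

Passing max_tries gives a way to stop instead of waiting indefinitely when the responses stay empty.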

Interrupting the execution before completion only leaves the JSON chunks, not the stitched file.
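If a run is interrupted, the leftover chunks could in principle be combined by hand. The sketch below assumes each chunk file holds a JSON array of messages and that the caller passes the paths in the desired order; both the chunk layout and the function name stitch_chunks are assumptions, not details confirmed by the scripts.

```python
import json


def stitch_chunks(chunk_paths, output_path):
    """Concatenate JSON-array chunk files into one JSON file.

    Assumes every chunk is a JSON list; returns the total message count.
    """
    messages = []
    for path in chunk_paths:
        with open(path, encoding="utf-8") as handle:
            messages.extend(json.load(handle))
    with open(output_path, "w", encoding="utf-8") as handle:
        json.dump(messages, handle)
    return len(messages)
```

A call might look like stitch_chunks(["first.json", "second.json"], "stitched.json"), where the file names are placeholders.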

The script is still in the development phase.

References

I used the following script to retrieve all the messages in a Facebook chat and then extracted only the links shared in it: https://github.com/RaghavSood/FBMessageScraper
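The link-filtering step can be illustrated with a simple regular expression. This is a sketch rather than the logic in extract_links.py: the pattern is deliberately loose, and the "body" key used to find message text is an assumption about the message JSON.

```python
import re

# Loose pattern: anything starting with http:// or https:// up to whitespace
# or a quote/angle bracket. Trailing punctuation is trimmed afterwards.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_links(text):
    """Return the URLs found in a piece of message text, in order."""
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(text)]


def links_from_messages(messages, text_key="body"):
    """Collect links from a list of message dicts.

    text_key is an assumption about where the message text is stored.
    """
    links = []
    for message in messages:
        links.extend(extract_links(message.get(text_key) or ""))
    return links


if __name__ == "__main__":
    print(extract_links("see https://example.com/a, and http://b.org."))
```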

TODO

LinkBin

A web application that will list all the links exchanged across all Facebook chats, grouped by author name, with a timestamp displayed for each link.

The data will be fetched directly from the JSON file that contains all the links from all the Facebook chats and will be bound to an HTML frontend (the web app in our case).

Further functionality will be added later, e.g. options for grouping the links by timestamp, topic, tag, etc.
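One possible shape for the grouping step is sketched below. The record keys "author", "timestamp", and "url" are hypothetical, since the layout of the links JSON file is not specified here.

```python
from collections import defaultdict


def group_links_by_author(records):
    """Group link records by author, sorting each group by timestamp.

    Each record is assumed to be a dict with "author", "timestamp",
    and "url" keys (a hypothetical layout).
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record["author"]].append((record["timestamp"], record["url"]))
    return {author: sorted(items) for author, items in grouped.items()}


if __name__ == "__main__":
    sample = [
        {"author": "A", "timestamp": 2, "url": "https://example.com/2"},
        {"author": "B", "timestamp": 1, "url": "https://example.com/1"},
    ]
    print(group_links_by_author(sample))
```

A frontend could then iterate over the returned dictionary to render one list of timestamped links per author.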
