Tweety

Uses the Twitter Streaming API to collect real-time tweets for recent high-traffic events and persists them to Elasticsearch. The stored tweets can later be filtered through a REST API.

Requirements

Python 2.7+, pip, Elasticsearch, Twitter developer app

Note: To create a Twitter developer app, visit the Twitter Application Management page.

How to run?

Move to <project-dir>, create a virtual environment, and then activate it as

$ cd <project-dir>
$ virtualenv .environment
$ source .environment/bin/activate

Copy settings_sample.py to settings.py, then edit the settings related to your Twitter developer app.

$ cp settings_sample.py settings.py

Add project to PYTHONPATH as

$ export PYTHONPATH="$PYTHONPATH:." # . corresponds to current directory(project-dir)

If you are using PyCharm, this can be set in the run configuration instead.

Under <project-dir> install requirements/dependencies as

$ pip install -r requirements.txt

Then run app.py as

$ python app.py

Now you can access the application by visiting {protocol}://{host}:{port}. For localhost it is http://localhost:5000.

Congratulations! You can now start streaming tweets and later filter the stored data with the Funneling API.

Schema

Fields: In Elasticsearch, every tweet document under tweets_index contains the following fields -

tweet_text: string,
screen_name: string,
user_name: string,
location: string,
source_device: string,
is_retweeted: boolean,
retweet_count: integer,
country: string,
country_code: string,
reply_count: integer,
favorite_count: integer,
created_at: datetime,
timestamp_ms: long,
lang: string,
hashtags: array
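
As a rough illustration of the schema above, the sketch below flattens a raw status payload into a document with these fields. The helper name `to_document` and the source keys (which follow the classic Twitter v1.1 JSON layout) are assumptions made for illustration; the project's own mapping code may differ.

```python
import re
from datetime import datetime


def to_document(status):
    """Flatten a raw status dict into the field layout listed above.

    Hypothetical helper: the source keys follow the classic Twitter v1.1
    payload and are an assumption, not taken from this project's code.
    """
    user = status.get("user") or {}
    place = status.get("place") or {}
    entities = status.get("entities") or {}
    # v1.1 timestamps are reported in UTC, e.g. "Sun Dec 17 14:19:26 +0000 2017".
    created = datetime.strptime(status["created_at"], "%a %b %d %H:%M:%S +0000 %Y")
    return {
        "tweet_text": status.get("text", ""),
        "screen_name": user.get("screen_name", ""),
        "user_name": user.get("name", ""),
        "location": user.get("location") or "",
        # "source" arrives as an HTML anchor; keep only the visible label.
        "source_device": re.sub(r"<[^>]+>", "", status.get("source", "")),
        "is_retweeted": bool(status.get("retweeted", False)),
        "retweet_count": int(status.get("retweet_count", 0)),
        "country": place.get("country", ""),
        "country_code": place.get("country_code", ""),
        "reply_count": int(status.get("reply_count", 0)),
        "favorite_count": int(status.get("favorite_count", 0)),
        "created_at": created.strftime("%Y-%m-%dT%H:%M:%S"),
        "timestamp_ms": int(status.get("timestamp_ms", 0)),
        "lang": status.get("lang", ""),
        "hashtags": [tag["text"] for tag in entities.get("hashtags", [])],
    }
```

Missing optional parts of the payload (for example a tweet without a place) fall back to empty strings or zero, which matches the empty country and country_code values seen in stored documents.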

Operators

Operators: The following operators are available to filter/query tweets -

equals : Facilitates exact match, or = operator for numeric/datetime values.
contains : Facilitates full-text search.
wildcard :
- startswith : ind* (Starts with ind),
- endswith : *ind (Ends with ind),
- wildcard : *ind* (searches ind anywhere in string)
gte : >= operator for numeric/datetime values.
gt : > operator for numeric/datetime values.
lte : <= operator for numeric/datetime values.
lt : < operator for numeric/datetime values.
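
One plausible way to picture these operators is as a translation into Elasticsearch query DSL clauses (`term`, `match`, `wildcard`, `range`). The sketch below is only an illustration under that assumption; `build_clause` is a hypothetical helper, and the project's actual translation may differ.

```python
RANGE_OPERATORS = ("gte", "gt", "lte", "lt")


def build_clause(field, operator, query):
    """Translate one (field, operator, query) triple into a query DSL clause.

    Hypothetical helper; not taken from this project's code.
    """
    if operator == "equals":
        return {"term": {field: query}}        # exact match
    if operator == "contains":
        return {"match": {field: query}}       # analyzed full-text search
    if operator == "wildcard":
        return {"wildcard": {field: query}}    # pattern using * placeholders
    if operator in RANGE_OPERATORS:
        return {"range": {field: {operator: query}}}
    raise ValueError("Unsupported operator: %s" % operator)
```

For example, `build_clause("created_at", "gte", "2017-12-17T14:18:13")` yields a `range` clause on `created_at`.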

APIs/Endpoints

Streaming

GET /stream?keywords=cricket,hockey,virat

This starts streaming real-time tweets that contain the given keywords. The tweets are persisted in Elasticsearch under the tweets_index index with the tweet document type.

Response

{
  "status": "success",
  "message": "Started streaming tweets with keywords [u'cricket', u'hockey', u'virat']"
}
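
A minimal Python 3 client sketch for this endpoint, using only the standard library. The helper names `build_stream_url` and `start_stream` are hypothetical, and the default base URL assumes the application is running locally on port 5000.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_stream_url(keywords, base_url="http://localhost:5000"):
    """Return the /stream URL for a list of keywords (comma-separated)."""
    # safe="," keeps the commas readable instead of encoding them as %2C.
    query = urlencode({"keywords": ",".join(keywords)}, safe=",")
    return "%s/stream?%s" % (base_url, query)


def start_stream(keywords, base_url="http://localhost:5000"):
    """Call the endpoint and return the decoded JSON response."""
    with urlopen(build_stream_url(keywords, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `start_stream(["cricket", "hockey", "virat"])` against a running instance should return a dictionary with the `status` and `message` keys shown above.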

Funneling/Searching

POST /funnel?from=0&size=20

Note: from and size are optional and can be used for limiting/pagination. The default size is 100.

Request body

{
    "sort": ["created_at"],                 // Use '-' sign for 'desc' order.
    "criteria": {
        "AND": [{
                "fields": ["created_at"],
                "operator": "gte",          // equals, contains, wildcard, gte, gt, lte, lt
                "query": "2017-12-17T14:18:13"
            }, {
                "fields": ["location"],
                "operator": "wildcard",
                "query": "*ind*"
            }, {
                "fields": ["hashtags"],     // 'hashtags' is an array field.
                "operator": "contains",
                "query": "Cricket"
            }
        ],
        "OR": [{
                "fields": ["hashtags"],
                "operator": "contains",
                "query": "cricket"
            }, {
                "fields": ["hashtags"],
                "operator": "contains",
                "query": "hockey"
            }
        ],
        "NOT": [{
                "fields": ["source_device"],
                "operator": "equals",
                "query": "Twitter for Android"
            }
        ]
    }
}
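
The request body can also be assembled programmatically. The following Python 3 sketch builds and sends a funnel request with the standard library. The helper names (`criterion`, `build_funnel_request`, `funnel`) are hypothetical, and both the local base URL and the `Content-Type: application/json` header are assumptions about how the server is deployed and expects its input.

```python
import json
from urllib.request import Request, urlopen


def criterion(fields, operator, query):
    """Build one entry for the AND / OR / NOT lists."""
    return {"fields": list(fields), "operator": operator, "query": query}


def build_funnel_request(criteria, sort=None, offset=0, size=20,
                         base_url="http://localhost:5000"):
    """Prepare (but do not send) a POST /funnel request."""
    body = {"criteria": criteria}
    if sort:
        body["sort"] = list(sort)
    url = "%s/funnel?from=%d&size=%d" % (base_url, offset, size)
    return Request(url, data=json.dumps(body).encode("utf-8"),
                   headers={"Content-Type": "application/json"},
                   method="POST")


def funnel(request):
    """Send a prepared request and decode the JSON response."""
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, `build_funnel_request({"NOT": [criterion(["source_device"], "equals", "Twitter for Android")]}, sort=["-created_at"])` prepares a request for the newest tweets that were not posted from the Android client.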

Response

{
    "count": {
        "total": 21,
        "fetched": 10
    },
    "results": [
        {
            "sort": [
                1513520366000
            ],
            "_type": "tweet",
            "_source": {
                "lang": "in",
                "is_retweeted": false,
                "retweet_count": 0,
                "screen_name": "T10CricketLive",
                "country": "",
                "created_at": "2017-12-17T14:19:26",
                "hashtags": [
                    "IndvSL",
                    "Cricket"
                ],
                "tweet_text": "Ind 193/2 (30 ov), need 23. Karthik 15(24), Dhawan 87(79). Bowling figures of Akila Dananjaya so far: 7-0-48-1. #IndvSL #Cricket",
                "source_device": "IFTTT",
                "reply_count": 0,
                "location": "New Delhi, India",
                "country_code": "",
                "timestamp_ms": "1513520366428",
                "user_name": "cricGuru5167",
                "favorite_count": 0
            },
            "_score": null,
            "_index": "tweets_index",
            "_id": "AWBk2AUVU3yhj98vAeu_"
        },
        {......},
        {......},
        {......},
        {......},
    ]
}
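
To page through a large result set, a client can combine the count block with the from parameter. The sketch below assumes that `fetched` is the number of results on the current page and `total` is the number of all matching tweets, which is inferred from the field names rather than confirmed by the project; `summarize_page` is a hypothetical helper.

```python
def summarize_page(response, offset=0):
    """Return the tweet texts on this page and the next `from` value.

    The next offset is None once every matching tweet has been fetched.
    """
    texts = [hit["_source"]["tweet_text"] for hit in response["results"]]
    next_offset = offset + response["count"]["fetched"]
    if next_offset >= response["count"]["total"]:
        next_offset = None
    return texts, next_offset
```

A caller would pass the returned offset as the from parameter of the next funnel request and stop when it is None.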
