TikTok video scraping and multimodal content analysis tool.

kariemoorman/tiktok-analyzer

TikTok-Analyzer: A TikTok Video Scraping and Content Analysis Tool


Description

Search for and download TikTok videos by username and/or video tag, then analyze their contents. Transcribe video speech to text and perform NLP analysis tasks (e.g., keyword and topic discovery, emotion/sentiment analysis). Isolate the audio signal and perform signal processing analysis tasks (e.g., pitch, prosody, and sentiment analysis). Isolate the visual stream and perform image analysis tasks (e.g., object detection, face detection).


Python Toolkit


Installation & Use

Virtual Environment
  • Clone the tiktok-analyzer Python package, or download it as a .zip.
git clone https://github.com/kariemoorman/tiktok-analyzer.git
  • Create a virtual environment inside the tiktok-analyzer directory.
cd tiktok-analyzer && python3 -m venv .venv 
  • Activate the virtual environment.
source .venv/bin/activate
  • Install package dependencies.
pip install -r requirements.txt

- Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

- Install ffmpeg
brew install ffmpeg

- Install face-recognition
cd face_recognition-1.3.0
python setup.py install 
  • Run the tiktok-analyzer program.
python src/tiktok-analyzer.py
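
Before running the program, it can help to confirm that ffmpeg from the steps above is actually visible on PATH. The following is a minimal sketch using only the standard library; the helper name `missing_tools` is hypothetical and is not part of tiktok-analyzer.

```python
import shutil


def missing_tools(tools=("ffmpeg",)):
    """Return the names in `tools` that cannot be found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```

If ffmpeg is reported missing, re-run the `brew install ffmpeg` step inside the same shell session and check again.
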
Docker Image
  • Clone the tiktok-analyzer Python package, or download it as a .zip.
git clone https://github.com/kariemoorman/tiktok-analyzer.git
  • Build the Docker image.
docker build -t tt-analyzer .
  • Run the Docker image as a container.
docker run --rm -ti tt-analyzer

Repository Contents

TikTok Video Scrapers
  • tiktok_user_video_scraper.py
    Choose either Selenium or Pyppeteer to dynamically scrape TikTok videos for one or more TikTok usernames.
    E.g., python3 tiktok_user_video_scraper.py <username> <username> -b pyppeteer -o csv

  • tiktok_tag_video_scraper.py
    Choose either Selenium or Pyppeteer to dynamically scrape TikTok videos for one or more TikTok tags.
    E.g., python3 tiktok_tag_video_scraper.py physics lhc -b pyppeteer -o csv

  • tiktok_video_metadata_scraper.py
    Export metadata from a TikTok video.
    E.g., python3 tiktok_video_metadata_scraper.py <tiktok_video_url>
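
The user and tag scraper examples above share one command-line shape: one or more positional names, a `-b` flag selecting Selenium or Pyppeteer, and a `-o` flag for the output format. The sketch below shows how such an interface could be parsed with `argparse`. It is not the repository's own parser: the long option names, the defaults, and the function name `build_parser` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser shaped like the README's scraper examples (a sketch)."""
    parser = argparse.ArgumentParser(description="Hypothetical scraper CLI sketch")
    # One or more usernames or tags, as in the README examples.
    parser.add_argument("names", nargs="+", help="one or more usernames or tags")
    # -b appears in the README; the long name and the default are assumptions.
    parser.add_argument("-b", "--browser", choices=["selenium", "pyppeteer"],
                        default="pyppeteer", help="scraping backend")
    # -o appears in the README with the value 'csv'; other values are not documented there.
    parser.add_argument("-o", "--output", default="csv", help="output format")
    return parser


# Parse the tag-scraper example from the README.
args = build_parser().parse_args(["physics", "lhc", "-b", "pyppeteer", "-o", "csv"])
print(args.names, args.browser, args.output)
```
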


TikTok Video Downloaders
  • tiktok_downloader.py
    Choose either Selenium or Pyppeteer to dynamically download one or more TikTok videos.
    E.g., python3 tiktok_downloader.py <tiktok_video_url> -b selenium -d firefox
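
When passing video URLs to a downloader, it can be useful to validate them first. The sketch below assumes the common web URL form `https://www.tiktok.com/@<username>/video/<numeric id>`; that pattern is an assumption about TikTok's URLs, not something this README specifies, and the helper `parse_video_url` is hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumed path layout: /@<username>/video/<digits>
_VIDEO_PATH = re.compile(r"^/@(?P<username>[^/]+)/video/(?P<video_id>\d+)/?$")


def parse_video_url(url):
    """Return (username, video_id), or None if the URL does not fit the assumed pattern."""
    parsed = urlparse(url)
    if parsed.netloc not in ("tiktok.com", "www.tiktok.com"):
        return None
    match = _VIDEO_PATH.match(parsed.path)
    if match is None:
        return None
    return match.group("username"), match.group("video_id")
```

Short links and mobile share links would not match this pattern and would need separate handling.
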

Speech Transcription
  • tiktok_video_to_text.py
    Choose either the Google or OpenAI ASR model to transcribe a TikTok video (in mp4 format).
    E.g., python3 tiktok_video_to_text.py <path/to/video_filename.mp4> -m openai
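
Speech transcription usually starts by separating the audio track from the mp4. The sketch below only builds an ffmpeg command for that step; whether `tiktok_video_to_text.py` does this internally, and with which settings, is not stated in this README. The 16 kHz mono WAV target and the helper name `audio_extract_command` are assumptions.

```python
from pathlib import Path


def audio_extract_command(video_path, sample_rate=16000):
    """Build an ffmpeg command that writes the audio track as a mono 16-bit WAV."""
    video = Path(video_path)
    wav = video.with_suffix(".wav")  # same location and stem, .wav extension
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", str(video),        # input video
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit PCM audio
        "-ar", str(sample_rate), # sample rate in Hz
        "-ac", "1",              # single (mono) channel
        str(wav),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once ffmpeg is installed.
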

Face & Object Detection
  • face_detection.py
    Run face detection on a TikTok video (in mp4 format).
    E.g., python3 face_detection.py <video.mp4>

  • object_detection.py
    Run object detection on a TikTok video (in mp4 format).
    E.g., python3 object_detection.py <video.mp4>


NLP Analysis
  • sentiment_analysis.py
    Run sentiment analysis tasks on TikTok video transcription data.
    E.g., python3 sentiment_analysis.py -t <document> -f 'output/file/path/filename.mp4/json'
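
Per-sentence sentiment scoring requires splitting a transcript into sentences first. The sketch below is a deliberately naive splitter based on end-of-sentence punctuation; it is not the method used by `sentiment_analysis.py`, which this README does not describe, and the function name `split_sentences` is hypothetical.

```python
import re


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


transcript = (
    "How you can check up a URL is safe. Go to Google and type transparency report. "
    "You can enter. Click this first one. We're going to go to Google safe browsing. "
    "Site status. And here you can enter the URL. Stay safe, follow for more."
)

for sentence in split_sentences(transcript):
    print(sentence)
```

A rule this simple will mis-split abbreviations such as "e.g." or "Dr."; a trained sentence segmenter is the safer choice for real transcripts.
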


Use Cases

Option 1: Download a TikTok Video.

Option 2: Transcribe a TikTok Video.

Option 3: Analyze a TikTok Video.


Example Use Case: Analyze Video (Face Detection & NLP)

Text:  How you can check up a URL is safe. Go to Google and type transparency report. You can enter. Click this first one. We're going to go to Google safe browsing. Site status. And here you can enter the URL. Stay safe, follow for more.

Tokens: ['how', 'you', 'can', 'check', 'up', 'a', 'url', 'is', 'safe', 'go', 'to', 'google', 'and', 'type', 'transparency', 'report', 'you', 'can', 'enter', 'click', 'this', 'first', 'one', 'we', 'going', 'to', 'go', 'to', 'google', 'safe', 'browsing', 'site', 'status', 'and', 'here', 'you', 'can', 'enter', 'the', 'url', 'stay', 'safe', 'follow', 'for', 'more']

Lemmas: ['how', 'you', 'can', 'check', 'up', 'a', 'url', 'be', 'safe', 'go', 'to', 'google', 'and', 'type', 'transparency', 'report', 'you', 'can', 'enter', 'click', 'this', 'first', 'one', 'we', 'go', 'to', 'go', 'to', 'google', 'safe', 'browsing', 'site', 'status', 'and', 'here', 'you', 'can', 'enter', 'the', 'url', 'stay', 'safe', 'follow', 'for', 'more']

Determiners (Dets): ['a', 'this', 'the']

Nouns: ['url', 'type', 'transparency', 'report', 'browsing', 'site', 'status', 'url']

Verbs: ['check', 'go', 'enter', 'click', 'go', 'go', 'google', 'enter', 'stay', 'follow']

Adjectives (Adjs): ['safe', 'first', 'safe', 'safe', 'more']

Adverbs (Advs): ['here']

Noun Phrases: ['you', 'a url', 'google and type transparency report', 'you', 'we', 'safe browsing', 'site status', 'you', 'the url']

Prepositional Phrases: ['to go', 'for follow']

Verb Phrases: ['check', 'go', 'enter', 'click', 'going go google', 'go google', 'google', 'here enter', 'stay follow', 'follow']

Emotion: Words: ['safe', 'safe', 'safe']

Sentence:  How you can check up a URL is safe.
Sentiment Score: 0.4404
Has Emotion: True
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.5
Subjectivity: 0.5
Emotion_words: [(['safe'], 0.5, 0.5, None)]
------------------------------

Sentence: Go to Google and type transparency report.
Sentiment Score: 0.0
Has Emotion: False
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.0
Subjectivity: 0.0
Emotion_words: []
------------------------------

Sentence: You can enter.
Sentiment Score: 0.0
Has Emotion: False
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.0
Subjectivity: 0.0
Emotion_words: []
------------------------------

Sentence: Click this first one.
Sentiment Score: 0.0
Has Emotion: False
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.25
Subjectivity: 0.3333333333333333
Emotion_words: [(['first'], 0.25, 0.3333333333333333, None)]
------------------------------

Sentence: We're going to go to Google safe browsing.
Sentiment Score: 0.4404
Has Emotion: True
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.5
Subjectivity: 0.5
Emotion_words: [(['safe'], 0.5, 0.5, None)]
------------------------------

Sentence: Site status.
Sentiment Score: 0.0
Has Emotion: False
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.0
Subjectivity: 0.0
Emotion_words: []
------------------------------

Sentence: And here you can enter the URL.
Sentiment Score: 0.0
Has Emotion: False
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.0
Subjectivity: 0.0
Emotion_words: []
------------------------------

Sentence: Stay safe, follow for more.
Sentiment Score: 0.4404
Has Emotion: True
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.5
Subjectivity: 0.5
Emotion_words: [(['safe'], 0.5, 0.5, None), (['more'], 0.5, 0.5, None)]
------------------------------
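
The per-sentence report above is plain `key: value` text separated by dashed lines, so it can be loaded back into Python for aggregation. The sketch below assumes exactly that layout, as printed in this example; the function names `parse_report` and `parse_value` are hypothetical and not part of tiktok-analyzer.

```python
import ast

# Two blocks copied from the example output above.
REPORT = """\
Sentence: Click this first one.
Sentiment Score: 0.0
Has Emotion: False
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.25
Subjectivity: 0.3333333333333333
Emotion_words: [(['first'], 0.25, 0.3333333333333333, None)]
------------------------------

Sentence: Stay safe, follow for more.
Sentiment Score: 0.4404
Has Emotion: True
Is Derogatory: False
Derogatory Score: 0.00
Polarity: 0.5
Subjectivity: 0.5
Emotion_words: [(['safe'], 0.5, 0.5, None), (['more'], 0.5, 0.5, None)]
------------------------------
"""


def parse_value(key, raw):
    """Keep sentences as text; read other values as Python literals when possible."""
    if key == "Sentence":
        return raw
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def parse_report(text):
    """Turn the dashed-line report into a list of dicts, one per sentence."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if set(line) == {"-"}:  # a separator line closes the current block
            if current:
                records.append(current)
            current = {}
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        current[key] = parse_value(key, value.strip())
    if current:
        records.append(current)
    return records


records = parse_report(REPORT)
mean_score = sum(r["Sentiment Score"] for r in records) / len(records)
print(len(records), "sentences, mean sentiment score:", round(mean_score, 4))
```

Once the records are dictionaries, they can be filtered (for example, keeping only sentences where `Has Emotion` is true) or written out as CSV or JSON.
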

License: GNU General Public License v3.0