
How to Scrape Twitter (X) User's Tweets Using tyo-crawler
Last Update: April 13, 2025
By eric
Scraping tweets from X (formerly known as Twitter) has become significantly more difficult in recent years, especially after the company's acquisition and subsequent rebranding. Once known for its relatively open developer ecosystem, X has shifted toward a more closed, monetized model. The public Twitter API, which previously allowed developers and researchers to access tweet data with manageable rate limits and at little to no cost, has been deprecated. In its place, the official X API now comes with restrictive access tiers and steep pricing, making it nearly inaccessible for small developers, independent researchers, and hobbyists.
This change has created a real bottleneck for those who still need access to tweet data — whether for sentiment analysis, social media monitoring, public opinion research, or competitive intelligence. As a result, many have turned to alternative tools and strategies to extract information directly from the site. However, scraping tweets comes with its own set of technical challenges, including anti-bot protections, rate limiting, and the need to mimic human-like behavior to avoid detection.
`tyo-crawler` is a general web scraping tool that can crawl web pages and extract data from them. It is designed to be flexible, allowing users to customize their scraping tasks according to their needs. In this article, we will explore how to use `tyo-crawler` to scrape tweets from X.
Prerequisites
Before we begin, make sure you have the following prerequisites:
- Node.js 16+: `tyo-crawler` is a Node.js-based tool.
- npm (or yarn): Node.js package manager.
- git: You need Git installed on your machine to clone the repository.
- tyo-crawler: Install it using git:

```bash
git clone https://github.com/e-tang/tyo-crawler.git
cd tyo-crawler
npm install
```

- Redis: Ensure you have Redis installed and running (a quick connectivity check follows this list). You can install it directly or use the provided Docker Compose setup:

```bash
# Using Docker Compose (recommended)
cd docker
docker-compose up -d

# Or, install Redis directly (example for Ubuntu)
# sudo apt-get update
# sudo apt-get install redis-server
# sudo systemctl start redis-server
```

- A Twitter (X) account: You will need an account to scrape tweets. Without one, you can only access the limited content X serves to logged-out visitors.
- (Optional) Basic understanding of JSON: You need to know how to edit a JSON file.
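Before starting the crawler, it's worth confirming that Redis is actually reachable. A quick check with redis-cli (assuming Redis is running on the default localhost:6379):

```bash
# Should print "PONG" if the Redis server is up and reachable
redis-cli ping
```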
Step-by-Step Guide
Here's how to use `tyo-crawler`'s X processor to scrape tweets from a specific X user's profile:
Step 1: Configure the Actions File (x.json)
`tyo-crawler` uses an actions file to define how it interacts with websites that require authentication. The repository provides a pre-configured actions file specifically for X, located at examples/x.example.json. Copy this file to the root directory of the `tyo-crawler` project and rename it to x.json:

```bash
cp examples/x.example.json x.json
```

Important: Do not modify the original x.example.json file in the examples directory. Always work with a copy.
Here's the content of x.example.json (which you should copy to x.json):
```json
[
{
"if": "a[href*=login]",
"then": [
{
"action": "click",
"selector": "a[href*=login]"
},
{
"action": "wait",
"time": 2000
},
{
"action": "wait",
"selector": "input[autocomplete='username']"
},
{
"action": "type",
"selector": "input[autocomplete='username']",
"value": "YOUR_X_USERNAME"
},
{
"action": "click",
"selector": "button[role='button'] span span",
"text": "Next"
},
{
"action": "wait",
"time": 2000
},
{
"action": "type",
"selector": "input[name='password']",
"value": "YOUR_X_PASSWORD"
},
{
"action": "click",
"selector": "button[data-testid='LoginForm_Login_Button']"
},
{
"action": "wait",
"time": 80000
},
{
"action": "saveCookies"
}
]
},
{
"repeat": [
{
"action": "scroll",
"value": 50
},
{
"action": "evaluate"
},
{
"action": "process"
},
{
"action": "wait",
"time": 1000
}
],
"failed_limit": 4
}
]
```
Explanation
Actions
The file is an array of actions to be performed.
Login Sequence (if: "a[href*=login]")
This part handles the login process.
- `click`: Clicks the login link.
- `wait`: Pauses execution for a specified time (in milliseconds) or until an element is present.
  - `time: 2000`: Waits for 2 seconds.
  - `selector: "input[autocomplete='username']"`: Waits until the username input field is present.
- `type`: Enters text into an input field.
  - `selector: "input[autocomplete='username']"`: The CSS selector for the username input field.
  - `value: "YOUR_X_USERNAME"`: Replace YOUR_X_USERNAME with your actual X username.
- `click`: Clicks the "Next" button.
  - `selector: "button[role='button'] span span"`: The CSS selector for the "Next" button.
  - `text: "Next"`: The text content of the button.
- `type`: Enters the password.
  - `selector: "input[name='password']"`: The CSS selector for the password input field.
  - `value: "YOUR_X_PASSWORD"`: Replace YOUR_X_PASSWORD with your actual X password.
- `click`: Clicks the login button.
- `wait` with `time: 80000`: Pauses execution for 80 seconds. Purpose: This is a crucial step. It gives you enough time to manually enter a two-factor authentication (2FA) code if you have 2FA enabled on your X account. Even without 2FA, this wait lets X's website fully load after login and helps avoid triggering anti-bot measures. (A shortened variant is sketched after this explanation.)
- `saveCookies`: Saves the cookies after login. This is important for maintaining the login session in subsequent actions and runs.
Repeat Sequence
This part handles the scrolling and data processing.
- `repeat`: An array of actions to be repeated.
  - `scroll`: Scrolls the page; `value: 50` scrolls down 50 pixels.
  - `evaluate`: Evaluates the page in the browser.
  - `process`: Processes the page to extract the tweet data.
  - `wait`: Waits for 1 second between iterations.
- `failed_limit: 4`: If the loop fails 4 times, the crawler stops.
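If your account doesn't have 2FA enabled and login completes quickly, you could shorten the 80-second wait. A minimal sketch of the adjusted action (20 seconds is an illustrative value, not a figure from the tool's documentation):

```json
{
  "action": "wait",
  "time": 20000
}
```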
Step 2: Edit the Actions File
Replace `YOUR_X_USERNAME` and `YOUR_X_PASSWORD` in x.json with your actual X username and password. This is crucial for the login process to work.
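After editing, the two `type` actions in x.json should look something like this (the handle and password below are placeholders, not real credentials):

```json
{
  "action": "type",
  "selector": "input[autocomplete='username']",
  "value": "my_x_handle"
},
{
  "action": "type",
  "selector": "input[name='password']",
  "value": "my_secret_password"
}
```

Keep in mind that x.json now holds your credentials in plain text, so avoid committing it to version control.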
Step 3: Run tyo-crawler with the X Processor
Now you can run `tyo-crawler` using the X processor and the x.json actions file.
Open your terminal, navigate to the `tyo-crawler` directory, and run the following command:
```bash
node index.js --show-window true --with-cookies true --actions-file ./x.json --processor x https://x.com/[ACCOUNT_NAME]
```
For example, to scrape tweets from the user CommSec, you would run:

```bash
node index.js --show-window true --with-cookies true --actions-file ./x.json --processor x https://x.com/CommSec
```
Explanation of the Command
- `node index.js`: Executes the main tyo-crawler script.
- `--show-window true`: Shows the browser window during the scraping process (useful for debugging and entering 2FA codes).
- `--with-cookies true`: Enables cookie handling, which is essential for maintaining the login session.
- `--actions-file ./x.json`: Specifies the path to the actions file.
- `--processor x`: Tells tyo-crawler to use the built-in X processor.
- `https://x.com/CommSec`: The target URL (replace CommSec with the desired user's profile).
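To scrape several profiles, one straightforward approach is to run the same command once per account. A minimal shell sketch (the account names are illustrative, and this assumes the cookies saved on the first login let later runs skip the interactive login):

```bash
# Scrape a list of X profiles one after another, reusing saved cookies
for account in CommSec nasa bbcnews; do
  node index.js --show-window true --with-cookies true \
    --actions-file ./x.json --processor x "https://x.com/$account"
done
```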
Step 4: Observe the Results
`tyo-crawler` will now:
- Launch a browser.
- Navigate to https://x.com/CommSec (or your specified URL).
- Attempt to log in using the credentials in x.json.
- Wait for 80 seconds to allow for 2FA or website loading.
- Repeatedly scroll down the page, 50 pixels at a time.
- Scrape the tweets.
- Save the scraped data.
The scraped data will be saved in the [ACCOUNT_NAME] directory by default.
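For the CommSec example, that means a directory named CommSec (presumably created where you launched the crawler), which you can inspect with:

```bash
# List the files the crawler produced for this account
ls -la CommSec/
```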
Step 5: Adapt and Enhance
- Adjust Scrolling: Modify the value in the `scroll` action; it controls how far the browser scrolls on each iteration, and therefore how much new content gets loaded and scraped. An adjusted example follows this list.
- Explore Other Actions: Refer to the README.md file for other available actions and parameters.
- Error Handling: While the X processor handles many common issues, you might still encounter errors. Monitor the console output for any error messages.
- Change the failed_limit: Adjust the `failed_limit` in the actions file to control how many times the crawler will attempt to scroll before stopping. This can help prevent infinite loops when unexpected errors occur.
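For instance, a more patient, farther-scrolling configuration of the repeat sequence might look like this (all values are illustrative; tune them against how quickly X loads new tweets for you):

```json
{
  "repeat": [
    { "action": "scroll", "value": 200 },
    { "action": "evaluate" },
    { "action": "process" },
    { "action": "wait", "time": 2000 }
  ],
  "failed_limit": 10
}
```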
Important Notes
- X's Terms of Service: Be aware of X's terms of service regarding scraping. Scraping may be against their rules, and they may block your IP address if they detect suspicious activity.
- Ethical Scraping: Be respectful of the website's resources. Don't overload their servers with requests.
- Website Changes: X's website structure can change at any time. While the X processor is designed to handle common changes, you might need to update the actions file or the processor itself if major changes occur.
- Anti-bot measures: X has strong anti-bot measures, so you may need to use proxies, rotate user agents, and add delays to avoid being blocked.
- Login: You need to provide your X username and password in the x.json file.
Conclusion
This tutorial demonstrates how to use tyo-crawler's powerful X processor and an actions file to scrape tweets from X efficiently. By leveraging these built-in features, you can avoid writing complex scraping scripts and focus on extracting the data you need. Remember to always be mindful of ethical and legal considerations when web scraping.