A few days ago, I published Collecting / Storing Tweets with Python and MongoDB. In that post, I describe the steps needed to collect and store tweets gathered via the Twitter Streaming API.
I received a comment on that post asking how to store data into MySQL instead of MongoDB. Here’s what you’d need to do to make that change.
Collecting / Storing Tweets with Python and MySQL
In the previous post, we stored the Twitter data into MongoDB with one line of code:
db.twitter_search.insert(datajson)
Unlike MongoDB, where we can insert the entire JSON object, if we want to use MySQL instead, we need to do some additional work with the 'datajson' object before storing it.
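To see what that "additional work" looks like, here's a minimal sketch of flattening a tweet into columns. The sample JSON below is a hypothetical, trimmed-down version of what the Streaming API returns, with only the fields we care about:

```python
import json

# A trimmed-down, hypothetical sample of the JSON Twitter returns
raw = '''{"id": 1012345, "text": "Loving #bigdata",
          "created_at": "Wed Oct 10 20:19:24 +0000 2018",
          "user": {"screen_name": "pythondata"}}'''

datajson = json.loads(raw)

# MongoDB could store datajson as-is; for MySQL we pull out flat column values
row = (datajson['id'], datajson['user']['screen_name'], datajson['text'])
print(row)  # (1012345, 'pythondata', 'Loving #bigdata')
```

Each value in `row` maps to one column in the MySQL table we'll create below.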
Let’s assume that we are interested in just capturing the username, date, Tweet and Tweet ID from Twitter. This is most likely the bare minimum information you’d want to capture from the API; there are many (many) more fields available, but for now we’ll go with these four.
Note: I’ll hold off on the MySQL specific changes for now and touch on them shortly.
Once you capture the tweet (line 38 in my original script) and have it stored in your datajson object, create a few variables to store the date, username, Tweet and ID.
text = datajson['text']
screen_name = datajson['user']['screen_name']
tweet_id = datajson['id']
created_at = parser.parse(datajson['created_at'])
Note in the above that we are using the parser.parse() command from the dateutil module to parse the created_at date for storage into MySQL.
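Twitter's created_at strings aren't in a format MySQL accepts directly, which is why dateutil helps here. A quick sketch of what the parse does (the timestamp below is an example value in Twitter's format):

```python
from dateutil import parser

# Twitter's created_at strings look like this example value
created_at = parser.parse("Wed Oct 10 20:19:24 +0000 2018")

# dateutil turns it into a timezone-aware datetime object,
# which the MySQL driver can serialize into a TIMESTAMP column
print(created_at.year, created_at.month, created_at.day)  # 2018 10 10
```

The resulting datetime object can be passed straight into a parameterized INSERT, as we do below.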
Now that we have our variables ready to go, we’ll need to store those variables into MySQL. Before we can do that, we need to set up a MySQL connection. I use the python-mysql connector, but you are free to use whatever you prefer. You’ll need an import MySQLdb to get the connector imported into your script (assuming you installed the connector with pip install mysql-python).
You’ll need to create a table to store this data. You can use the SQL statement below to do so if you need assistance / guidance.
CREATE TABLE `twitter` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tweet_id` varchar(250) DEFAULT NULL,
  `screen_name` varchar(128) DEFAULT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `text` text,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=56 DEFAULT CHARSET=utf8;
Now, let’s set up our MySQL connection, query and execute/commit for the script. I’ll wrap this in a function so it can be re-used for each tweet captured.
def store_data(created_at, text, screen_name, tweet_id):
    db = MySQLdb.connect(host=HOST, user=USER, passwd=PASSWD, db=DATABASE, charset="utf8")
    cursor = db.cursor()
    insert_query = "INSERT INTO twitter (tweet_id, screen_name, created_at, text) VALUES (%s, %s, %s, %s)"
    cursor.execute(insert_query, (tweet_id, screen_name, created_at, text))
    db.commit()
    cursor.close()
    db.close()
    return
That’s it. You are now collecting tweets and storing those tweets into a MySQL database.
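If you want to try the parameterized-insert pattern without a running MySQL server, the same shape can be sketched with Python's built-in sqlite3 module. This is a stand-in for illustration only, not the MySQLdb code above; note that sqlite3 uses `?` placeholders where MySQLdb uses `%s`:

```python
import sqlite3

# Throwaway in-memory database instead of a real MySQL server
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute(
    "CREATE TABLE twitter (id INTEGER PRIMARY KEY, tweet_id TEXT, "
    "screen_name TEXT, created_at TEXT, text TEXT)"
)

def store_data(created_at, text, screen_name, tweet_id):
    # Same pattern as the MySQL version; sqlite3 uses '?' instead of '%s'
    insert_query = "INSERT INTO twitter (tweet_id, screen_name, created_at, text) VALUES (?, ?, ?, ?)"
    cursor.execute(insert_query, (tweet_id, screen_name, created_at, text))
    db.commit()

store_data("2018-10-10 20:19:24", "Loving #bigdata", "pythondata", "1012345")
print(cursor.execute("SELECT screen_name, text FROM twitter").fetchall())
# [('pythondata', 'Loving #bigdata')]
```

In both drivers, passing the values as a tuple (rather than string-formatting them into the query) lets the driver handle escaping, which protects against SQL injection.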
Full Script
from __future__ import print_function
import tweepy
import json
import MySQLdb
from dateutil import parser

WORDS = ['#bigdata', '#AI', '#datascience', '#machinelearning', '#ml', '#iot']

CONSUMER_KEY = "KEY"
CONSUMER_SECRET = "SECRET"
ACCESS_TOKEN = "TOKEN"
ACCESS_TOKEN_SECRET = "TOKEN_SECRET"

HOST = "YOUR_DATABASE_HOST"
USER = "YOUR_DATABASE_USER"
PASSWD = "YOUR_DATABASE_PASSWORD"
DATABASE = "YOUR_DATABASE"

# This function takes the 'created_at', 'text', 'screen_name' and 'tweet_id'
# and stores them in a MySQL database
def store_data(created_at, text, screen_name, tweet_id):
    db = MySQLdb.connect(host=HOST, user=USER, passwd=PASSWD, db=DATABASE, charset="utf8")
    cursor = db.cursor()
    insert_query = "INSERT INTO twitter (tweet_id, screen_name, created_at, text) VALUES (%s, %s, %s, %s)"
    cursor.execute(insert_query, (tweet_id, screen_name, created_at, text))
    db.commit()
    cursor.close()
    db.close()
    return

class StreamListener(tweepy.StreamListener):
    # This is a class provided by tweepy to access the Twitter Streaming API.

    def on_connect(self):
        # Called initially to connect to the Streaming API
        print("You are now connected to the streaming API.")

    def on_error(self, status_code):
        # On error - if an error occurs, display the error / status code
        print('An Error has occurred: ' + repr(status_code))
        return False

    def on_data(self, data):
        # This is the meat of the script... it connects to MySQL and stores the tweet
        try:
            # Decode the JSON from Twitter
            datajson = json.loads(data)

            # Grab the wanted data from the Tweet
            text = datajson['text']
            screen_name = datajson['user']['screen_name']
            tweet_id = datajson['id']
            created_at = parser.parse(datajson['created_at'])

            # Print out a message to the screen that we have collected a tweet
            print("Tweet collected at " + str(created_at))

            # Insert the data into the MySQL database
            store_data(created_at, text, screen_name, tweet_id)
        except Exception as e:
            print(e)

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

# Set up the listener. The 'wait_on_rate_limit=True' is needed to help with Twitter API rate limiting.
listener = StreamListener(api=tweepy.API(wait_on_rate_limit=True))
streamer = tweepy.Stream(auth=auth, listener=listener)
print("Tracking: " + str(WORDS))
streamer.filter(track=WORDS)
Want more information on working with MySQL and Python? Check out the book titled MySQL for Python by Albert Lukaszewski.
The post Collecting / Storing Tweets with Python and MySQL appeared first on Python Data.