Python Tweet Deleter.

How to automate tweet deletion with python.

Alberta Odamea Anim-Ayeko
Towards Data Science


This article assumes you already have some fundamental python programming knowledge.

If you have a couple of thousand tweets on your Twitter account and you’re trying to delete some from years ago, without signing up for tweet-deleting software and without sharing your Twitter information with anyone else, this post is for you. It’s going to be a long one, but if you follow along you’ll be fine. You might want to delete your tweets for a couple of reasons. It could be because your old tweets are embarrassing or you want to let go of ignorant opinions from your past. Whatever the reason, this Python tutorial is going to be worth it.

I started out with this article written by Kris Shaffer close to four years ago, but got stuck when I realized the Twitter data I had downloaded was in a different format from his. I then headed to Google so that I didn’t have to reinvent the wheel. The resources I found were mostly source code on GitHub, so I tried them out and fixed some bugs along the way.

(Left) Photo by Hitesh Choudhary on Unsplash | (Right) Photo by MORAN on Unsplash

Your personal twitter archive and developer account.

Before we proceed to the main tutorial about how to get your tweets deleted, you need to sign up for a Twitter developer account and download your Twitter archive. Follow this link to download your Twitter archive. After downloading the .zip file, unzip it and find the file named ‘tweet.js’ in the ‘data’ folder.

You’ll also need access to the Twitter API. Click this and then fill in the form. It’s pretty simple. After completing it, you’ll be given your keys and tokens (consumer_key, consumer_secret, access_token, access_token_secret), which are vital for the code below to run successfully. Don’t share your keys with anyone, and answer the form honestly, otherwise your account could get blocked by Twitter.
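One way to avoid sharing the keys by accident along with your code is to keep them out of the notebook and read them from environment variables. This is only a minimal sketch and not part of the scripts in this post: the variable names (TWITTER_CONSUMER_KEY and so on) and the helper load_keys are assumptions made up for illustration.

```python
import os

# Hypothetical environment variable names; pick your own and set them yourself.
KEY_NAMES = [
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
]

def load_keys(env=os.environ):
    """Return the four credentials as a dict, or raise if any is missing."""
    missing = [name for name in KEY_NAMES if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in KEY_NAMES}
```

In cmd, a variable can be set for the current session with set NAME=value before starting Jupyter, and the values returned by load_keys() can then be used in place of the hard-coded strings.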

Getting your environment ready.

This process makes use of the python programming language, but you don’t need to be a guru to follow the steps in this tutorial. I’m going to walk you through how to freshly install python and all the modules used…

This is done on a Windows OS, but if you run macOS or a Linux distro you can definitely follow along on your system as well.
If you already have python on your system, skip step 1.

1. Python installation.
Start a Windows command prompt (cmd) and run the command python
If you see a Python version (e.g. Python 3.7.3) in your terminal, you’re good to go; type exit() to leave the Python prompt. Otherwise, check out this video to get Python installed on your computer.
(see what I did there? for step one?👀)

2. Creating a virtual environment to install all modules used.
First of all, create a folder on your desktop called ‘tweetdeleter’ where you can keep everything related to this tutorial, and save your ‘tweet.js’ file in it. Then, in cmd, move into that folder, since that is where the venv* will be created. Use the command cd (which stands for change directory) followed by a folder name until you are inside your ‘tweetdeleter’ folder. Lastly, run this command to create the tweetdeleter venv:
python -m venv tweetdeleter

3. Activating your venv
Run the command tweetdeleter\Scripts\activate.bat (on macOS or Linux, the equivalent is source tweetdeleter/bin/activate)

Photo by author. (Creating and activating the venv)

4. Installing the packages in your ‘tweetdeleter’ venv
The following commands must be run in your terminal after your venv has been activated.
pip install tweepy
pip install jupyter notebook
The datetime and json modules used later come with Python itself, so they don’t need to be installed with pip.


Jupyter notebook is an IDE*(my go-to IDE), where you can run your code. If you already have one that you use, that’s great! You don’t need to install a new one then. It has a pretty simple interface, but I love it because it’s great for exploring and visualizing data. To check if your packages have truly been installed, run the command ‘pip freeze’ or ‘pip list’ in your cmd terminal.

Photo by author. (Result of running the pip freeze command)

5. Moving on to the point where we use the code and where you see results.
a. Open your jupyter notebook by running the command jupyter notebook in your venv in cmd.

The libraries used.

Tweepy: Python module used to access the Twitter API (which has so many functions that you can play around with, like ‘destroy_status’, ‘destroy_friendship’, ‘update_status’ and many others). Tweepy is great for automating Twitter processes and this story is a great example of that.

Datetime: Python module which helps with the manipulation of dates and times. The scripts below use two of its classes: ‘datetime’, which represents a date and time, and ‘timedelta’, which represents a length of time that can be added to or subtracted from a datetime. They also use ‘timezone’ to mark the current time as UTC.

Json: In-built Python module used for dealing with JSON (JavaScript Object Notation) data. I like to see the JSON in our tweet file as a list of Python dictionaries.
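As a rough illustration of that "list of dictionaries" view, the sketch below parses a tiny hand-written JSON string shaped like the entries the scripts in this post read. The ids and dates are made up, and a real archive entry carries many more fields.

```python
import json

# Made-up sample with the same nesting the scripts in this post rely on:
# a list of dicts, each with a 'tweet' dict holding 'id_str' and 'created_at'.
sample = '''
[
  {"tweet": {"id_str": "1111", "created_at": "Thu Nov 05 10:15:30 +0000 2020"}},
  {"tweet": {"id_str": "2222", "created_at": "Sat Nov 07 08:00:00 +0000 2020"}}
]
'''

tweets = json.loads(sample)  # json.loads parses a string; json.load reads a file

print(type(tweets).__name__)         # list
print(tweets[0]["tweet"]["id_str"])  # 1111
```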

N.B.

  • Tweet deletion using the scripts below is done at approximately 1 tweet/sec, so you might want to have a crack at this tutorial in your free time.
  • This story is written not only for avid coders or people with coding experience but for people just starting out as well, so don’t be mad when you read explanations for the most basic of things :)
  • Before copying and pasting the code below into your jupyter notebook, open your ‘tweet.js’ file, delete these characters at the start of the file (window.YTD.tweet.part0 =) and then save it. This is essential because what remains is plain JSON made of only two data structures (a list and dicts*), which is what the json module expects, so nothing else may come before it.
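If you would rather not edit the file by hand, the prefix can also be skipped in code. The sketch below is an alternative and not the method used in the rest of this post: it assumes the file consists of the prefix followed by a single JSON list, and the helper name parse_tweet_js is made up for illustration.

```python
import json

def parse_tweet_js(text):
    """Drop everything before the first '[' and parse the rest as JSON."""
    start = text.index("[")  # raises ValueError if no list is found
    return json.loads(text[start:])

# Tiny made-up stand-in for the contents of tweet.js
raw = 'window.YTD.tweet.part0 = [{"tweet": {"id_str": "1111"}}]'
tweets = parse_tweet_js(raw)
print(tweets[0]["tweet"]["id_str"])  # 1111
```

With a real file, read its contents first, for example with open('tweet.js', encoding='UTF-8').read(), and pass that text to the function.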

Tweet Deleter Scripts

The following Python scripts will do all the hard work for you once you run them, so you can continue that task you never finished. You need to keep an eye on them though, so that you can google any errors and check the progress you’re making. There are three options to choose from, seen below…

1. Deleting all tweets before a certain date

The script

# Block1: imports
import json
import tweepy
from datetime import datetime, timedelta, timezone

# Block2: your keys and tokens
consumer_key = 'XXXX-XXXX-XXXX'
consumer_secret = 'XXXX-XXXX-XXXX'
access_token = 'XXXX-XXXX-XXXX'
access_token_secret = 'XXXX-XXXX-XXXX'

# Block3: authenticate with the Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# Block4: tweets created before this date will be deleted
cutoff_date = datetime.now(timezone.utc) - timedelta(days=1711)
print(cutoff_date)

# Block5: load the archive (change the path to wherever your tweet.js is saved)
with open("C:/Users/ALBERTA ANIM-AYEKO/Desktop/info/tweet.js", "r", encoding="UTF-8") as fp:
    myjson = json.load(fp)

# Block6: delete every tweet older than the cutoff date
trash_tweet_ids = []
for tweet in myjson:
    d = datetime.strptime(tweet['tweet']['created_at'], "%a %b %d %H:%M:%S %z %Y")
    if d < cutoff_date:
        trash_tweet_ids.append(tweet['tweet']['id_str'])
        api.destroy_status(tweet['tweet']['id_str'])
        print(tweet['tweet']['created_at'] + " " + tweet['tweet']['id_str'] + ' deleted successfully')
print(str(len(trash_tweet_ids)) + ' trash tweets have been deleted…')

Explanation

This script has 6 blocks of code…
#Block1: Importing modules needed for the script to be run successfully.

#Block2: Inputting your keys and tokens which will help you gain access to the Twitter API.

#Block3: Using tweepy and the necessary keys to get the Twitter API authentication.

#Block4: Creating the ‘cutoff_date’ variable
cutoff_date will be the variable where we’ll store the date before which you want all tweets to be deleted.
The code for the variable moves back (because of the subtraction sign, ‘-’) the number of days you specify, from the current moment to a point in the past. You have to choose a date before which all tweets should be deleted, then work out the number of days between that date and today: cutoff date = current date minus the number of days. Here’s an example. If today is 7th November, 2020 and you want a cutoff date of 1st January, 2018, the ‘days’ value to put into the ‘timedelta’ function should be 1041. So you decide what you want that to be…

Creating the cutoff_date variable.
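The day count for Block4 does not have to be worked out by hand. As a small sketch, subtracting two dates in Python gives the number of days between them; the helper name days_back is made up for illustration.

```python
from datetime import date

def days_back(cutoff, today=None):
    """Number of days from `cutoff` up to `today` (defaults to the current date)."""
    if today is None:
        today = date.today()
    return (today - cutoff).days

# The example from the text: 7 November 2020 with a cutoff of 1 January 2018
print(days_back(date(2018, 1, 1), today=date(2020, 11, 7)))  # 1041
```

The result can be passed to timedelta(days=...). Note that the script subtracts the days from the current time of day, so the cutoff lands at that same time on the cutoff date rather than at midnight.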

#Block5
Open the tweet.js file by calling the in-built Python function, ‘open’, and parse its contents with the json.load() function. Because the file holds a JSON list, the result is a Python list of dicts, which is saved in the variable ‘myjson’.

#Block6
An empty list is assigned to the variable ‘trash_tweet_ids’. A ‘for loop’ is used to go through the tweet data stored in the ‘myjson’ variable and pick out the created_at key of each tweet. Its value is a date in the format %a %b %d %H:%M:%S %z %Y, but stored as a string.

a: Abbreviated day of the week, e.g. Thu
b: Abbreviated month, e.g. Nov
d: Day of the month
H: Hour
M: Minute
S: Second
z: Offset from UTC time. +0000 means time is exactly UTC
Y: Year
datetime.strptime() converts that string to a datetime object, which is stored in a variable, d, and then compared to the cutoff_date (if d < cutoff_date:). The conversion is needed because a string can’t be compared with a datetime object. If the tweet is older than the cutoff date, its id is added to ‘trash_tweet_ids’ and the tweet is deleted with api.destroy_status().
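To make the conversion concrete, here is a small sketch that parses one made-up created_at value and compares it with a cutoff. It uses a fixed reference time instead of datetime.now() so the result is the same on every run, and it assumes an English locale, since %a and %b match day and month names in the current locale.

```python
from datetime import datetime, timedelta, timezone

fmt = "%a %b %d %H:%M:%S %z %Y"
created_at = "Thu Nov 05 10:15:30 +0000 2020"  # made-up sample value

d = datetime.strptime(created_at, fmt)
print(d)  # 2020-11-05 10:15:30+00:00

# Fixed reference time (an assumption for this example only)
reference = datetime(2020, 11, 7, tzinfo=timezone.utc)
cutoff_date = reference - timedelta(days=1)
print(d < cutoff_date)  # True
```

Because the format includes %z, d carries a timezone, which is why the cutoff also has to be timezone-aware, as it is in the script with datetime.now(timezone.utc).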

2. Deleting tweets between a time frame

The script

# Block1: imports
import json
import tweepy
from datetime import datetime, timedelta, timezone

# Block2: your keys and tokens
consumer_key = 'XXXX-XXXX-XXXX'
consumer_secret = 'XXXX-XXXX-XXXX'
access_token = 'XXXX-XXXX-XXXX'
access_token_secret = 'XXXX-XXXX-XXXX'

# Block3: tweets created between these two dates will be deleted
end_date = datetime.now(timezone.utc) - timedelta(days=615)
print('end_date = ', end_date)
start_date = datetime.now(timezone.utc) - timedelta(days=1711)
print('start_date = ', start_date)

# Block4: authenticate with the Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# Block5: load the archive (change the path to wherever your tweet.js is saved)
with open("C:/Users/ALBERTA ANIM-AYEKO/Desktop/info/tweet.js", "r", encoding="UTF-8") as fp:
    myjson = json.load(fp)

# Block6: delete every tweet created between start_date and end_date
trash_tweet_ids = []
for tweet in myjson:
    d = datetime.strptime(tweet['tweet']['created_at'], "%a %b %d %H:%M:%S %z %Y")
    if d > start_date and d < end_date:
        trash_tweet_ids.append(tweet['tweet']['id_str'])
        try:
            api.destroy_status(tweet['tweet']['id_str'])
            print(tweet['tweet']['created_at'] + " " + tweet['tweet']['id_str'] + ' deleted successfully')
        except Exception as e:
            print("There was an exception: ", e)

Explanation

There are 6 blocks of code for the second option but only blocks 3 and 6 haven’t been explained above…
#Block3
For this script two dates are needed (start_date and end_date) so that all tweets between those dates can be deleted.

#Block6
An empty list is assigned to the variable ‘trash_tweet_ids’. A ‘for loop’ is used to iterate over the Twitter data in the ‘myjson’ variable, pick out the ‘created_at’ key, convert it to a datetime object and store it in a variable, d. For every dict in ‘myjson’, if d falls between the start and end dates, its corresponding id is added to ‘trash_tweet_ids’.
Try and Except: Basically, the try and except blocks let you try out some code and handle an error respectively. When the code in the try block runs smoothly, the tweet is deleted and the success message is printed. When it doesn’t, the except block catches the exception and outputs it so you know what error you’re dealing with, and the loop carries on with the next tweet.

Success and exception messages.
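To see the try and except pattern without touching a real account, here is a small sketch. fake_destroy is a made-up stand-in for api.destroy_status, and the ids and error text are invented; it fails on one id on purpose so that both branches run.

```python
def fake_destroy(tweet_id):
    """Hypothetical stand-in for api.destroy_status; fails for one id."""
    if tweet_id == "2222":
        raise RuntimeError("made-up error for this example")
    return tweet_id

deleted, failed = [], []
for tweet_id in ["1111", "2222", "3333"]:
    try:
        fake_destroy(tweet_id)
        deleted.append(tweet_id)
        print(tweet_id + " deleted successfully")
    except Exception as e:
        failed.append(tweet_id)
        print("There was an exception: ", e)

print(deleted)  # ['1111', '3333']
print(failed)   # ['2222']
```

Keeping the failed ids in their own list, as done here, is one possible way to retry them later; the script above only prints the error.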

3. Using the slicing of lists to delete tweets

The script

import tweepy

consumer_key = 'XXXX-XXXX-XXXX'
consumer_secret = 'XXXX-XXXX-XXXX'
access_token = 'XXXX-XXXX-XXXX'
access_token_secret = 'XXXX-XXXX-XXXX'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# myjson is the list loaded from tweet.js in the scripts above
for tweet in myjson[-250:]:  # taking the bottom 250 tweets
    try:
        api.destroy_status(tweet['tweet']['id_str'])
        print('destroyed successfully')
    except Exception as e:
        print('There was an exception, destroy unsuccessful. ', e)
Explanation

The ‘myjson’ variable that we’ve been dealing with is a list of dicts. This means that it can be sliced. Slicing a list is a way of accessing a range of elements in the list using indexes. By slicing, you can pick out certain portions of the tweet file and then proceed to delete them. In the code above, the slice takes the last 250 tweets in the list, which is written as ‘-250:’ in terms of indexes. A ‘for loop’ is used here again to go through those 250 tweets and delete each of them.
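A short sketch of how slicing behaves, using a small made-up list in place of the tweet data:

```python
tweets = ["t0", "t1", "t2", "t3", "t4", "t5"]  # stand-in for myjson

print(tweets[-2:])   # ['t4', 't5']  the last two items
print(tweets[:2])    # ['t0', 't1']  the first two items
print(tweets[2:4])   # ['t2', 't3']  items at indexes 2 and 3

# Asking for more items than the list holds simply returns the whole list
print(len(tweets[-250:]))  # 6
```

So if your archive holds fewer than 250 tweets, the loop in the script goes through all of them.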

  • *IDE- Integrated development environment
  • * venv: same as virtual environment, and dict: same as dictionary(but I’m sure you figured that out 👀)
  • In case you take a break or continue this tutorial another time, you can always start your venv again by navigating to your ‘tweetdeleter’ folder in cmd and running the command ‘tweetdeleter\Scripts\activate.bat’
  • Check this out for more api functions you can play around with.
  • Remember to google any errors you may face because there’s so much help on the internet.

I hope you find other interesting api functions and write more cool twitter automation scripts. Thanks for reading!
