How Data Science Made Twitter a Top Social Media Channel
From challenging projects to cutting-edge technology, Twitter uses everything to give its users a one-of-a-kind experience. It is the back-end work of some smart people (you can also call them Data Scientists) that makes the elite, influencers, bloggers and even the people around us look cool. From Mann Ki Baat with the PM to the First Lady of the U.S. to hilarious tweets about the World Cup, the world no longer has to wait 24 hours for the news to be aired.
P.S. I made sure to keep the introduction within 280 characters. 😅
Speaking of Data Scientists at Twitter, this article will help you understand how Data Science is used at Twitter and how data scientists contribute to maintaining the product's competitive edge. You will also get an insight into the types of Data Scientists at Twitter and their roles and responsibilities.
How is Data Science Used at Twitter?
Data Science is used at Twitter in two different ways, because there are two types of data scientists who work differently –
Type A Data Scientists
The A here stands for Analysis. This is a more static approach to analyzing data and gaining insights from it. The work of a Type A data scientist is closely related to that of a statistician. A Type A Data Scientist is well versed in data cleaning, working with large datasets, data visualization, domain knowledge, etc.
Type B Data Scientists
The B here stands for Building. While Type B Data Scientists share a background in statistics with Type A Data Scientists, they are also well versed in coding and the fundamentals of software engineering. They are responsible for building data products that interact directly with the user, such as products that provide recommendations and other forms of interactive results.
Data Platform at Twitter
The scale of a data platform depends on the type of company, and there are three types –
- Early-Stage Startups
- Mid-Stage Growing Startups
- Enterprises and Large-Scale Companies
An early-stage startup does not require a data-intensive platform like Hadoop, largely because it has little data and faces a cold start. A mid-stage startup focuses mostly on gaining insights from its data. A mature company like Twitter, however, already has a well-developed data platform. A large-scale enterprise like Twitter has various requirements, such as maintaining its competitive edge, efficiency in logistics and optimization, all of which call for Data Scientists who are skilled at Machine Learning. At Twitter, hundreds of MapReduce jobs are processed on a daily basis, supported by efficient and reliable ETL processes.
Responsibilities of a Data Scientist at Twitter
The responsibilities of a Data Scientist at Twitter fall into four categories –
1. Developing Insights from the Product
Using data to discover insights, and applying those insights to better the product, is one of the key responsibilities of the data scientist. This data is gathered whenever a user interacts with the device, and it is ultimately stored in a log file or as metadata for further use.
There are various ways of analyzing this data. One is a straightforward analysis of push notifications to understand user eligibility. Another is the analysis of SMS delivery rates across different carriers, and a third is the analysis of multiple user accounts.
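To make the carrier analysis a little more concrete, here is a minimal sketch of how delivery rates could be computed from a log. The record layout, the carrier names and the `delivery_rates` helper are all made up for illustration; this is not Twitter's actual log format or code.

```python
from collections import defaultdict

# Hypothetical log records: (carrier, was_delivered). Values are made up.
sms_log = [
    ("carrier_a", True),
    ("carrier_a", False),
    ("carrier_a", True),
    ("carrier_b", True),
    ("carrier_b", True),
    ("carrier_c", False),
]


def delivery_rates(records):
    """Return the share of delivered messages for each carrier."""
    sent = defaultdict(int)
    delivered = defaultdict(int)
    for carrier, ok in records:
        sent[carrier] += 1
        if ok:
            delivered[carrier] += 1
    return {carrier: delivered[carrier] / sent[carrier] for carrier in sent}


rates = delivery_rates(sms_log)
for carrier, rate in sorted(rates.items()):
    print(f"{carrier}: {rate:.0%}")
```

A carrier whose rate is clearly lower than the others would be the natural place to start digging.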
2. Building Data Pipelines
Data pipelines are used extensively at Twitter. A data pipeline aggregates data from multiple sources and makes it easier for the data scientist to perform operations on it.
Analysis at Twitter is carried out through these data pipelines. They allow jobs to be executed automatically and power the dashboards through which users consume the data.
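The idea of a pipeline can be sketched in a few lines of Python. This is only a toy extract-transform-load example under assumed inputs: the two event sources, their fields and the function names are hypothetical, and a real pipeline at Twitter's scale would run on a distributed platform rather than in a single script.

```python
from collections import Counter

# Hypothetical event sources; the field names are assumptions for this sketch.
web_events = [
    {"user": "u1", "action": "tweet"},
    {"user": "u2", "action": "like"},
    {"user": "", "action": "tweet"},  # malformed record: no user
]
mobile_events = [
    {"user": "u1", "action": "like"},
    {"user": "u3", "action": "tweet"},
    {"user": "u3", "action": "tweet"},
]


def extract(*sources):
    """Combine records from every source into one stream."""
    for source in sources:
        yield from source


def transform(events):
    """Drop malformed records and keep only the action name."""
    for event in events:
        if event.get("user") and event.get("action"):
            yield event["action"]


def load(actions):
    """Aggregate into a summary that a dashboard could display."""
    return dict(Counter(actions))


def run_pipeline(*sources):
    return load(transform(extract(*sources)))


summary = run_pipeline(web_events, mobile_events)
print(summary)
```

Because each stage is a plain function, the whole job can be scheduled to run automatically and its output handed to a dashboard.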
3. Performing Experimentation (A/B Testing)
Another important role of a Data Scientist at Twitter is to carry out A/B testing. A/B testing is basically a randomized experiment with two variants. It is a form of hypothesis testing through which the company can determine which variant draws more users. At Twitter, experiments like A/B tests are run with an in-house tool called Duck Duck Goose (DDG). It aggregates the big data gathered from millions of tweets, changes in the social graph, server logs and records of user interactions through web and mobile clients.
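For readers who want to see the hypothesis test behind an A/B experiment, here is a minimal sketch using a two-proportion z-test. The visitor and conversion counts are made up, the function name is hypothetical, and this is a textbook test, not a description of how DDG works internally.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: variant A converts 200 of 5000 users, variant B 260 of 5000.
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("The difference is statistically significant at the 5% level.")
else:
    print("No significant difference detected.")
```

The 5% threshold is a common convention, not a rule; a team would normally fix it, along with the sample size, before the experiment starts.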
4. Predictive Modeling
Predictive Modeling and Machine Learning are two of the most important responsibilities of a Data Scientist. Twitter is a data playground, and its colossal amount of data can be harnessed through various predictive modeling and machine learning techniques.
With the help of machine learning, data scientists at Twitter are able to reduce the number of spam messages that reach users. They also apply advanced deep learning techniques to provide relevant notifications.
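To give a flavour of what a spam model does, here is a tiny Naive Bayes text classifier written from scratch. It is only an illustration: the training messages are invented, the `fit` and `predict` helpers are hypothetical, and Naive Bayes is a simple stand-in chosen for this sketch, not the technique Twitter is known to use.

```python
import math
from collections import Counter

# Invented training messages, for illustration only.
train = [
    ("win a free prize now", "spam"),
    ("free followers click here", "spam"),
    ("click to win free money", "spam"),
    ("great game last night", "ham"),
    ("see you at lunch tomorrow", "ham"),
    ("loved the match last night", "ham"),
]


def fit(examples):
    """Count words per label and collect the vocabulary."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, label_counts, vocab


def predict(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        # Laplace (add-one) smoothing so unseen words do not zero out a label.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = fit(train)
print(predict("free prize click", model))
print(predict("lunch last night", model))
```

A production system would train on far more data and far richer signals than the words in a message, but the principle of learning from labelled examples is the same.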
In this article, we went through the daily responsibilities of a Data Scientist at Twitter. As one of the world's largest social media platforms, Twitter relies on these Data Scientists to gain insights about its users and provide them with relevant content. We also learned about the types of Data Scientists that are hired at Twitter.
We hope that after reading this article you are motivated to start your career as a Data Scientist at Twitter. If you want more such articles or data science case studies, let us know in the comments. We will definitely respond. Here is the next article – Steps that will make your Career in Data Science.
Happy learning 😊