Channel: Planet Python

Dataquest: Building An Analytics Data Pipeline In Python


If you’ve ever wanted to work with streaming data, or data that changes quickly, you may be familiar with the concept of a data pipeline. Data pipelines allow you to transform data from one representation to another through a series of steps. Data pipelines are a key part of data engineering, which we teach in our new Data Engineer Path.
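To make the idea of "a series of steps" concrete, here is a minimal sketch of a pipeline as a chain of plain Python functions, where each step receives the output of the previous one. The step names (`strip_whitespace`, `drop_empty`, `count_items`) and the helper `run_pipeline` are hypothetical and chosen only for illustration; they are not part of any library.

```python
from functools import reduce


def strip_whitespace(lines):
    """Step 1: remove leading and trailing whitespace from every line."""
    return [line.strip() for line in lines]


def drop_empty(lines):
    """Step 2: discard lines that are empty after stripping."""
    return [line for line in lines if line]


def count_items(lines):
    """Step 3: reduce the cleaned lines to a single number."""
    return len(lines)


def run_pipeline(data, steps):
    """Feed the data through each step in order (hypothetical helper)."""
    return reduce(lambda acc, step: step(acc), steps, data)


# Made-up input, only to show the data changing shape step by step.
raw = ["  a  ", "", "b\n", "   "]
result = run_pipeline(raw, [strip_whitespace, drop_empty, count_items])
print(result)  # 2
```

Because each step only depends on the output of the one before it, steps can be added, removed, or reordered without touching the others.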

A common use case for a data pipeline is figuring out information about the visitors to your web site. If you’re familiar with Google Analytics, you know the value of seeing real-time and historical information on visitors. In this blog post, we’ll use data from web server logs to answer questions about our visitors.
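As a rough sketch of what answering a question from web server logs might look like, the snippet below counts requests per client IP address. It assumes log lines in the widely used "combined" access log format; the sample lines, the regular expression `LOG_PATTERN`, and the function `count_visits_by_ip` are illustrative assumptions, not taken from a real server.

```python
import re
from collections import Counter

# Assumed layout: ip, two ignored fields, [time], "request", status, size.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

# Made-up sample lines using documentation-reserved IP addresses.
sample_lines = [
    '203.0.113.5 - - [09/Mar/2017:01:15:59 +0000] "GET /blog/ HTTP/1.1" 200 3478 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [09/Mar/2017:01:16:02 +0000] "GET /about/ HTTP/1.1" 200 1200 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [09/Mar/2017:01:17:10 +0000] "GET /blog/ HTTP/1.1" 404 512 "-" "curl/7.50"',
]


def count_visits_by_ip(lines):
    """Count requests per client IP, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


visits = count_visits_by_ip(sample_lines)
print(len(visits))  # number of distinct IPs seen in the sample
```

Counting distinct IP addresses is only an approximation of counting visitors, since several people can share one address and one person can appear under several.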

If you’re unfamiliar, every time you visit a web page, such as the Dataquest Blog, your browser is sent data from a web server. To host this blog, we use a high-performance web server called Nginx. Here’s what happens between you typing in a URL and seeing a result:

The process of sending a request from a web browser to a server.

First, the client sends a...

