Website Detective: Python Flask, MongoDB and Celery

Find out who is viewing your webpage, from where and with what device.

I have a simple profile page, namieluss.com, which I created several years ago, back when I was studying for my Bachelor’s degree. Every once in a while, I log on to check that things are working well and to update the server.

There are many reasons to keep your servers up to date, ranging from security to performance. I do so on the servers I manage every once in a while, even when everything is working well…

“There are two ways to write error-free programs; only the third one works”

Back to my server logs: I saw lots of entries that made me curious. Who is checking out a simple, plain website? I needed to know more about my website visitors.

Who accessed my Page?

Just like a curious kid, photo by Joseph Rosales on Unsplash

I wrote some simple code to help me figure that out… hurrah. I’ll simplify and explain to you how to do the same using a simple app.

Things to be covered:

  1. Creating a simple Flask application
  2. Adding a task using Celery
  3. Performing CRUD operations on MongoDB

I will be using Python Flask, Celery and MongoDB for this project. You will need some basic coding knowledge to follow the topics below. You also need to have Python and MongoDB installed on your device.

Nevertheless, I will make this as simple as possible.

Flask — a python micro web framework

What I like about Flask is its simplicity. It ships with a built-in development server, routing and templating, which goes a long way. You can get things done quickly without having to worry about structural complexity. I’ll use it to run a simple HTTP web server. Note that Flask’s built-in server is meant for development only; when deploying your Flask app, serve it with a production WSGI server such as Gunicorn or uWSGI instead.

Python Celery — task queue

Celery is an open-source asynchronous task queue. Task queues are used as a mechanism to distribute work across threads or machines. Celery requires a message broker to pass messages between the Flask app and the workers, and a result backend if you want to store task results. A common broker choice is RabbitMQ, but you can also use Redis or MongoDB. I chose MongoDB, as both broker and result backend, for this demonstration.
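To make the task queue idea concrete, here is a minimal sketch that uses only the standard library. It is not Celery: an in-process queue stands in for the broker, a plain list stands in for the result backend, and slow_lookup() is a made-up placeholder for slow work.

```python
import queue
import threading


def run_worker(tasks, results):
    """Consume (func, args) jobs until a None sentinel arrives."""
    while True:
        job = tasks.get()
        if job is None:  # sentinel: stop the worker
            tasks.task_done()
            break
        func, args = job
        results.append(func(*args))
        tasks.task_done()


def slow_lookup(ip):
    """Hypothetical stand-in for slow work such as a geolocation request."""
    return {"ip": ip, "checked": True}


tasks = queue.Queue()  # plays the role of the broker
results = []           # plays the role of the result backend

worker = threading.Thread(target=run_worker, args=(tasks, results), daemon=True)
worker.start()

# The "web app" side only enqueues work and carries on.
tasks.put((slow_lookup, ("203.0.113.7",)))
tasks.put(None)

# Waiting here is for demonstration only; a web request would not block.
tasks.join()
worker.join()
```

Celery follows the same pattern, except that the queue lives in an external broker, so the workers can run in other processes or on other machines.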

MongoDB — document-oriented database

MongoDB is a general-purpose, document-based, distributed database built for modern application developers and for the cloud era. I will be using it to store the access logs and task results.

The Code

The snippets below briefly explain the app implementation.

Main — app/__init__.py

This is the main part of the app. Some external files, such as app/constant.py and app/task.py, were created to keep the code clean and simple. Remember DRY and KISS from software design principles?

KISS: “Keep It Simple, Stupid”

DRY: Don’t Repeat Yourself

The main module contains a basic Flask configuration. PyMongo (the MongoDB Python library) and Celery are configured to meet the requirements of this app.

# set the Celery broker URL
app.config['CELERY_BROKER_URL'] = MONGODB_CON_STR

# set the Celery result backend URL
app.config['CELERY_RESULT_BACKEND'] = MONGODB_CON_STR
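MONGODB_CON_STR comes from app/constant.py. Here is a sketch of how such a constant could be built. The database name website_detective, the MONGODB_URI environment variable and the build_mongodb_uri() helper are my assumptions for illustration, not names from the original code.

```python
import os
from urllib.parse import quote_plus


def build_mongodb_uri(host="localhost", port=27017, db="website_detective",
                      user=None, password=None):
    """Assemble a mongodb:// URI; credentials are percent-escaped."""
    auth = ""
    if user and password:
        # Characters such as '@' or '/' in credentials must be escaped.
        auth = f"{quote_plus(user)}:{quote_plus(password)}@"
    return f"mongodb://{auth}{host}:{port}/{db}"


# MONGODB_URI is a hypothetical environment variable; when it is not set,
# fall back to a local, unauthenticated MongoDB instance.
MONGODB_CON_STR = os.environ.get("MONGODB_URI", build_mongodb_uri())
```

Reading the value from the environment keeps credentials out of the source code, and the default still works for a local install.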

I added three pages: an index (home) page, an about page and a contact page. You can add as many pages as you want.
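For reference, three pages like these can be declared in Flask roughly as follows. This is a bare-bones sketch: the returned strings are placeholders, and a real app would render templates instead.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def index():
    # Placeholder body; a real page would use render_template().
    return "Home page"


@app.route("/about")
def about():
    return "About page"


@app.route("/contact")
def contact():
    return "Contact page"

# Start the development server with: flask --app <module name> run
```

Each extra page is just one more function with its own @app.route decorator.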

Whenever a user accesses the home page (http://mywebsite.com/) or the about page (http://mywebsite.com/about), a Celery task is queued through the check_who_and_where() function.
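Celery serializes task arguments before sending them to the broker, so the Flask request object itself can’t be handed to the task. A simple approach is to pull out a few plain values first. The helper names and the task signature below are my assumptions, shown only as a sketch.

```python
def client_ip(headers, remote_addr):
    """Return the visitor's IP, preferring the first X-Forwarded-For hop.

    Only trust X-Forwarded-For when the app sits behind your own proxy,
    because clients can set this header to anything they like.
    """
    forwarded = headers.get("X-Forwarded-For", "")
    first_hop = forwarded.split(",")[0].strip()
    return first_hop or remote_addr


def visit_payload(page, headers, remote_addr):
    """Collect plain, serializable values describing one page visit."""
    return {
        "page": page,
        "ip": client_ip(headers, remote_addr),
        "user_agent": headers.get("User-Agent", ""),
    }

# Inside a Flask view this could be used as follows (hypothetical signature):
#   payload = visit_payload("home", request.headers, request.remote_addr)
#   check_who_and_where.delay(**payload)
```

Calling .delay() only puts a message on the queue, so the view can return the page immediately.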

Celery Task — app/task.py

task.py contains the Celery task. The main reason for using Celery is to run the check_who_and_where() function asynchronously, so the user never has to wait for the lookup while the page loads.

The task fetches details about the visitor and stores them in the database.
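Two of those details, the browser and the platform, can be guessed from the User-Agent header. Below is a rough, dependency-free sketch; parse_user_agent() is a hypothetical helper, and a dedicated user-agent library would be far more accurate.

```python
# Order matters: Android agents also mention "Linux", iPhone agents mention
# "like Mac OS X", and Chrome, Edge and Opera agents all mention "Safari".
PLATFORMS = (
    ("windows", "windows"),
    ("android", "android"),
    ("iphone", "iphone"),
    ("ipad", "ipad"),
    ("mac os x", "macos"),
    ("linux", "linux"),
)
BROWSERS = (
    ("edg/", "edge"),
    ("opr/", "opera"),
    ("firefox/", "firefox"),
    ("chrome/", "chrome"),
    ("safari/", "safari"),
)


def parse_user_agent(user_agent):
    """Guess browser and platform from a raw User-Agent string."""
    ua = user_agent.lower()
    platform = next((name for token, name in PLATFORMS if token in ua), "unknown")
    browser = next((name for token, name in BROWSERS if token in ua), "unknown")
    return {"browser": browser, "platform": platform}
```

The first matching token wins, which is why the more specific names are listed before the generic ones.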

I used the ipapi.co API to find the city, region, country, latitude, longitude and other general details for the user’s external IP address. There are several APIs out there you can utilise for the same purpose. Some are paid; others are free or freemium.

You can check ipapi.co to learn more about the details it returns for any given external IP address.
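Here is a sketch of the lookup and of building the record to store, using only the standard library. The field list is taken from the sample record in this post. The function names, the timeout, and the handling of an "error" flag in the response are my assumptions.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen

# Fields kept from the geolocation response (taken from the sample record).
GEO_FIELDS = ("city", "region", "region_code", "country_code",
              "country_name", "latitude", "longitude", "org")


def parse_geo(raw):
    """Decode a JSON response body, keeping only the fields we store."""
    try:
        data = json.loads(raw)
    except ValueError:
        return {}
    # Assumption: a failed lookup is flagged with an "error" key.
    if not isinstance(data, dict) or data.get("error"):
        return {}
    return {field: data.get(field) for field in GEO_FIELDS}


def fetch_geo(ip, timeout=5):
    """Look an IP address up on ipapi.co; return {} if the request fails."""
    try:
        with urlopen(f"https://ipapi.co/{ip}/json/", timeout=timeout) as resp:
            return parse_geo(resp.read())
    except OSError:  # covers URLError, HTTPError and timeouts
        return {}


def build_log_entry(page, ip, browser, platform, geo):
    """Combine visit details and geolocation into one log document."""
    entry = {"page": page, "ip": ip, "browser": browser,
             "platform": platform, "date": datetime.now(timezone.utc)}
    entry.update({field: geo.get(field) for field in GEO_FIELDS})
    return entry
```

Inside the Celery task, the resulting dictionary could then be saved with PyMongo’s insert_one() on the page_access_log collection.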

View Logs

The logs are stored inside MongoDB and can be accessed through the mongo shell or by running the view_logs() function. You can also use a MongoDB GUI like Robo 3T if you despise the command line, but I doubt it… haha, you came this far.
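A function like view_logs() could look roughly like the sketch below. The signature, the output format and the website_detective database name are my assumptions; page_access_log is the collection name used in the sample further down.

```python
LOG_FIELDS = ("date", "ip", "page", "browser", "platform", "city", "country_name")


def format_log_line(doc):
    """Render one log document as a single readable line."""
    return " | ".join(str(doc.get(name, "-")) for name in LOG_FIELDS)


def view_logs(collection, limit=10):
    """Print the newest `limit` entries from a PyMongo collection."""
    # Hide the internal _id field and sort newest first (-1 = descending).
    cursor = collection.find({}, {"_id": 0}).sort("date", -1).limit(limit)
    for doc in cursor:
        print(format_log_line(doc))

# Usage, assuming a local MongoDB and `pip install pymongo`:
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017/")
#   view_logs(client["website_detective"]["page_access_log"])
```

Keeping the formatting in its own function makes it easy to change what gets printed without touching the query.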

# sample data stored
# db.page_access_log.findOne({})
data = {
    "browser": "chrome",
    "city": "Dallas",
    "country_code": "US",
    "country_name": "United States",
    "date": "Sat, 06 Jun 2020 08:29:28 GMT",
    "ip": "ip.ad.dre.ss",
    "latitude": 32.8477,
    "longitude": -96.7025,
    "org": "ZNET",
    "page": "profile",
    "platform": "windows",
    "region": "Texas",
    "region_code": "TX"
}
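Once a few records like this have piled up, answering “from where and with what device” is a matter of counting. Here is a small sketch; summarise() is a hypothetical helper and the sample entries are made up for illustration.

```python
from collections import Counter


def summarise(logs, field):
    """Count log entries per value of `field`, most common first."""
    return Counter(doc.get(field, "unknown") for doc in logs).most_common()


# Made-up entries, trimmed down to the fields used here.
sample_logs = [
    {"country_code": "US", "browser": "chrome", "platform": "windows"},
    {"country_code": "US", "browser": "firefox", "platform": "linux"},
    {"country_code": "DE", "browser": "chrome", "platform": "android"},
]

by_country = summarise(sample_logs, "country_code")
by_browser = summarise(sample_logs, "browser")
```

The same function works on documents read straight from the page_access_log collection.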

Software engineer, runner, hiker, fitness enthusiast. https://namieluss.com