Commit 760fb4c

2 parents 96831b8 + 9abbeb4 commit 760fb4c

1 file changed

+258
-0
lines changed

docs/Production_ready_database.md

Lines changed: 258 additions & 0 deletions
@@ -0,0 +1,258 @@
## Executive Summary

This project enhances Threading Labs’ existing Django-based video game catalog application by implementing advanced PostgreSQL database administration techniques to make the system production ready. The primary objective was to introduce robust data control, auditing, performance tuning, and disaster recovery capabilities.

Key achievements include the design and enforcement of least-privilege access through custom PostgreSQL roles, so that each role has only the permissions necessary to perform its own responsibilities. Real-time audit logging was implemented via PL/pgSQL triggers, which capture modifications to sensitive tables and store them in a dedicated audit schema.

To support business intelligence and reporting, analytics-ready materialized views were developed. These views allow efficient querying of aggregated data without placing additional load on transactional tables. Query performance was improved by identifying bottlenecks with EXPLAIN ANALYZE and adding indexes on critical columns that the application accesses frequently.

An automated backup and restore mechanism was established, with backups scheduled via Windows Task Scheduler. Restore validation is performed in an isolated Docker container to confirm the integrity and completeness of backup files without affecting the production system.

This work improves data security and visibility, supports analytics through a reporting module, and provides a resilient backup strategy.

Table of Contents
-----------------
- [Architecture](#architecture)
- [Schemas](#schemas)
- [Role Permissions](#role-permissions)
- [SQL](#sql)
- [Backup/Restore Process](#backuprestore-step-by-step)
- [How The Components Work Together](#how-the-components-work-together)

## Architecture
-----------------
The system is built around a Django web application composed of:

- **App Module**: Manages transactional operations via the app_writer role.
- **Reporting Module**: Queries optimized materialized views via the report_user role.
- **Audit Module**: Records DML events in an audit table through triggers; the auditor role reads them.
- **Backup Scheduler**: Periodically dumps the entire database as backup_user (assigned the backup_role role) and stores the dump in external backup storage (a folder on the host machine that is mounted into the Docker container).

The PostgreSQL database is divided into:
- Public schema for main application data
- Analytics schema for materialized views and reporting
- Audit schema for audit logging

Each role is scoped to its own module for a strong security boundary.

## Schemas
--------
### `public` schema

![img_3.png](img_3.png)

This is the main transactional schema used by the Django application. It includes all core entities required for managing the video game catalog and user interactions.

It contains tables such as:

- videogame table: Stores all videogame records.
- genre, review, copy, userprofile, developer tables: Supporting entities linked to games and users.
- Standard Django authentication and permission tables: auth_user, auth_group, auth_permission, django_session

Foreign keys enforce referential integrity between users, games, genres, and other related entities. This schema supports the full CRUD operations required by the web application.

### `audit` schema

![img_5.png](img_5.png)

The audit schema is designed to capture all critical data changes for compliance and traceability.

It includes an audit_log table that records every insert, update, and delete performed on the audited tables of the public schema.

Triggers are attached to all critical tables:
- videogame
- genre
- auth_user
- review
- developer
- copy
- userprofile

Each AFTER trigger writes a record to the `audit.audit_log` table for every insert, update, or delete action, capturing the table name, operation, changed data, timestamp, and user.

The `auditor` role has only read access to this schema, and `app_writer` is granted the insert permissions that are necessary for logging.


### `analytics` schema

![img_4.png](img_4.png)

The analytics schema is designed for reporting and analytical workloads. It contains multiple materialized views optimized for business intelligence use cases:

- cumulative_releases: yearly and cumulative game release counts
- top_reviewed_games_per_genre: top 5 reviewed games per genre
- avg_rating_per_game: average ratings across all reviews per game
- copy_availability_summary: availability count of game copies

All views are refreshed periodically and exposed with read-only access to users holding the `report_user` role. These views support use cases like management dashboards and BI tools.

All analytic views and performance indexes are included in the SQL snippets provided and are kept isolated from transactional workloads to maintain efficiency and separation of concerns.

## Role Permissions
-------------------

### Role Permissions Comparison

PostgreSQL uses a role-based access control model to manage database security. Roles are similar to user groups and can be granted specific privileges such as reading, writing, or administering data. In this project, roles were used to enforce the principle of least privilege, ensuring each type of user or process has only the permissions necessary for its function.

These roles govern what the user can access or modify in the database. This structured approach improves data security, simplifies permission management, and makes the system easier to audit.

Below is a breakdown of the roles used in this system and their respective permissions:

| Role Name | Description | Accessible Schemas | Allowed Actions | Assigned User |
|---------------|----------------------------------------------|--------------------------|--------------------------------------------------|---------------------|
| `app_reader` | Read-only role for viewing app data | `public` | `SELECT` on all tables | `game_reader` |
| `app_writer` | Full CRUD for app, also writes audit logs | `public`, `audit` | `SELECT`, `INSERT`, `UPDATE`, `DELETE` on `public`; `INSERT` on `audit_log`; `SELECT` on `audit_log_id_seq` | `game_writer` |
| `auditor` | Read-only access to audit logs | `audit` | `SELECT` on all audit tables | `audit_user` |
| `report_user` | Analytics/reporting access | `analytics` | `SELECT` on all materialized views | `report_user_app` |
| `backup_role` | Least-privilege access for backup automation | `public`, `audit`, `analytics` | `CONNECT`; `SELECT` on all tables and sequences; `USAGE` on all schemas | `backup_user` |



## SQL

### Triggers

The system uses PL/pgSQL triggers to audit changes made to critical tables.
These triggers insert records into the audit.audit_log table whenever an INSERT, UPDATE, or DELETE operation happens.
This enables real-time tracking of user modifications across important entities.

For example:

`CREATE TRIGGER trg_audit_videogame
AFTER INSERT OR UPDATE OR DELETE ON public.videogames_register_videogame
FOR EACH ROW
EXECUTE FUNCTION audit_if_modified();`

You can find all the triggers in the SQL folder of this repository.

### Indexes

To improve query performance, several indexes were added to optimize filtering.
Indexes were chosen based on profiling with EXPLAIN ANALYZE; the screenshots below show how execution time drops for the most common queries:

`-- For grouping/searching by genre
CREATE INDEX idx_genre_id ON public.videogames_register_videogame(genre_id);`

**Before Index**

![img_6.png](img_6.png)

**After Index**

![img_7.png](img_7.png)


`-- For filtering by date
CREATE INDEX idx_release_date ON public.videogames_register_videogame(release_date);`


**Before Index**

![img_8.png](img_8.png)

**After Index**

![img_9.png](img_9.png)

`-- For searching by title
CREATE INDEX idx_title_lower ON public.videogames_register_videogame(LOWER(title));`

**Before Index**
![img_10.png](img_10.png)

**After Index**

![img_11.png](img_11.png)


Several more indexes were added on the most important columns of every table; you can find them in the SQL folder.


### Materialized Views

The analytics schema contains several materialized views designed to support reporting, trend analysis, and dashboard features.
These views use PostgreSQL’s window functions, aggregation, and ranking.

For example:

`CREATE MATERIALIZED VIEW analytics.cumulative_releases AS
WITH yearly_counts AS (
    SELECT
        EXTRACT(YEAR FROM release_date)::INT AS year,
        COUNT(*) AS games_this_year
    FROM public.videogames_register_videogame
    GROUP BY EXTRACT(YEAR FROM release_date)
)
SELECT
    year,
    games_this_year,
    SUM(games_this_year) OVER (ORDER BY year) AS cumulative_total
FROM yearly_counts
ORDER BY year;
`

This materialized view is also integrated with the Django application frontend.

You can find the materialized views in the SQL folder of this repository.

## Backup/restore step by step


First, open Task Scheduler on Windows.
Create a new task:
- **Trigger**: Daily at 02:00 AM
- **Action**: Run `python.exe`
- **Arguments**: `C:\path_to_backup.py`
- **Start in**: `C:\path_to_our_project`

After that, make sure that `backup.py`:
- Stores SQL backups in the `backups/` folder when it runs
- Pulls credentials securely from the `backup.env` file

To validate a restore, it is best to have Docker Desktop installed and running. You can search for the postgres image in Docker Desktop, then pull and run it, making sure the paths match the ones used by `backup.py`.
Alternatively, you can run the following command in PowerShell:

`docker run -d --name demo -e POSTGRES_PASSWORD=demo_pass -v "C:\Users\youruser\PycharmProjects\Videogames_project\backups:/backups" postgres:latest`

This mounts the backup folder inside the container.

Then, to test it, you can run:

`docker exec -it demo bash`

`su - postgres`

`psql -c "DROP DATABASE IF EXISTS videogames_restore_test;"`

`psql -c "CREATE DATABASE videogames_restore_test;"`

`latest=$(ls -1t /backups/videogames_backup_*.sql | head -n1)`

`psql -d videogames_restore_test -f "$latest"`

`psql -d videogames_restore_test -c "SELECT COUNT(*) FROM public.videogames_register_videogame;"`

The output should be a valid row count, for example 42.

![img_12.png](img_12.png)


## How the Components Work Together
-----------------------------------
This project is structured so that each component has a defined responsibility.
The Django web app handles user interaction and data management, interacting with the public schema via the app_writer role.
The audit module logs changes in real time using triggers and stores them in a separate schema that only the auditor role can read.
The reporting module accesses materialized views in the analytics schema through the report_user role.
Backups are automated using a dedicated backup_user and a scheduled task, with Docker enabling isolated restoration and validation in a test container.



## Reason of The Chosen Architecture
--------------------------------------

This layered architecture ensures separation of concerns, data security, and operational resilience. By isolating schemas and limiting role permissions, we apply the principle of least privilege, reducing risk in case of compromise.
Audit logs and materialized views enhance accountability and analytical capabilities without affecting transactional performance. In addition, using Docker for restore validation ensures that backups are not only taken but also tested in a reproducible environment.
