| 1 | +## Executive Summary |
| 2 | + |
| 3 | +This project enhances Threading Labs’ existing Django-based video game catalog application by implementing advanced PostgreSQL database administration techniques to make the system production ready. The primary objective was to introduce robust data control, auditing, performance tuning, and disaster recovery capabilities. |
| 4 | + |
| 5 | +Key achievements include the design and enforcement of least-privilege access through custom PostgreSQL roles, so that each role has only the permissions necessary for its own responsibilities. Real-time audit logging was implemented via PL/pgSQL triggers, which capture modifications to sensitive tables and store them in a dedicated audit schema. |
| 6 | + |
| 7 | +To support business intelligence and reporting, analytics-ready materialized views were developed. These views allow efficient querying of aggregated data without placing additional load on transactional tables. Query performance was improved by identifying bottlenecks with `EXPLAIN ANALYZE` and adding indexes on critical columns that the application accesses frequently. |
| 8 | + |
| 9 | +An automated backup and restore mechanism was established, with backups scheduled via Windows Task Scheduler. Restore validation is performed in an isolated Docker container to confirm the integrity and completeness of backup files without affecting the production system. |
| 10 | + |
| 11 | +This work not only improves data security and visibility but also supports analytics through a reporting module and ensures a resilient backup strategy. |
| 12 | + |
| 13 | +Table of Contents |
| 14 | +----------------- |
| 15 | +- [Architecture](#architecture) |
| 16 | +- [Schemas](#schemas) |
| 17 | +- [Role Permissions](#role-permissions) |
| 18 | +- [SQL](#sql) |
| 19 | +- [Backup/Restore Process](#backuprestore-step-by-step) |
| 20 | +- [How The Components Work Together](#how-the-components-work-together) |
| 21 | + |
| 22 | +## Architecture |
| 23 | +----------------- |
| 24 | +The system is built around a Django Web Application composed of: |
| 25 | + |
| 26 | +- **App Module**: Manages transactional operations via the app_writer role. |
| 27 | +- **Reporting Module**: Queries optimized materialized views via the report_user role. |
| 28 | +- **Audit Module**: Captures DML events through triggers and records them in an audit table that the auditor role can read. |
| 29 | +- **Backup Scheduler**: Periodically dumps the entire database using backup_user (a member of backup_role) and stores the dump in external backup storage (a folder on the host machine that is mounted into the Docker container). |
| 30 | + |
| 31 | +The PostgreSQL database is divided into: |
| 32 | +- Public schema for main application data |
| 33 | +- Analytics schema for materialized views and reporting |
| 34 | +- Audit schema for audit logging |
| 35 | + |
| 36 | +Each role is scoped to its own module for a strong security boundary. |
| 37 | + |
| 38 | +## Schemas |
| 39 | +-------- |
| 40 | +### `public` schema |
| 41 | + |
| 42 | + |
| 43 | + |
| 44 | +This is the main transactional schema used by the Django application. It includes all core entities required for managing the video game catalog and the user interactions. |
| 45 | + |
| 46 | +It contains tables such as: |
| 47 | + |
| 48 | +- videogame table: Stores all videogame records. |
| 49 | +- genre, review, copy, userprofile, developer tables: Supporting entities linked to games and users. |
| 50 | +- Standard Django authentication and permission tables: auth_user, auth_group, auth_permission, django_session |
| 51 | + |
| 52 | +Foreign keys enforce referential integrity between users, games, genres, and other related entities. This schema supports the full CRUD operations required by the web application. |
| 53 | + |
| 54 | +### `audit` schema |
| 55 | + |
| 56 | + |
| 57 | + |
| 58 | +The audit schema is designed to capture all critical data changes for compliance and traceability. |
| 59 | + |
| 60 | +It includes an audit_log table that records every insert, update, and delete performed on the audited tables of the public schema. |
| 61 | + |
| 62 | +Triggers are attached to all critical tables: |
| 63 | +- videogame |
| 64 | +- genre |
| 65 | +- auth_user |
| 66 | +- review |
| 67 | +- developer |
| 68 | +- copy |
| 69 | +- userprofile |
| 70 | + |
| 71 | +Each `AFTER` trigger writes a record to the `audit.audit_log` table for every insert, update, or delete, capturing the table name, operation, changed data, timestamp, and user. |
| 72 | + |
| 73 | +The `auditor` role has read-only access to this schema, and `app_writer` is granted only the insert permission needed for logging. |
| 74 | + |
| 75 | + |
| 76 | +### `analytics` schema |
| 77 | + |
| 78 | + |
| 79 | + |
| 80 | +The analytics schema is designed for reporting and analytical workloads. It contains multiple materialized views optimized for business intelligence use cases: |
| 81 | + |
| 82 | +- cumulative_releases: yearly and cumulative game release counts |
| 83 | +- top_reviewed_games_per_genre: top 5 reviewed games per genre |
| 84 | +- avg_rating_per_game: average ratings across all reviews per game |
| 85 | +- copy_availability_summary: availability count of game copies |
| 86 | + |
| 87 | +All views are refreshed periodically and exposed read-only through the `report_user` role. They support use cases such as management dashboards and BI tools. |
| 88 | + |
| 89 | +All analytic views and performance indexes are included in the SQL snippets provided and are kept isolated from transactional workloads to maintain efficiency and separation of concerns. |
| 90 | + |
| 91 | +## Role Permissions |
| 92 | +------------------- |
| 93 | + |
| 94 | +### Role Permissions Comparison |
| 95 | + |
| 96 | +PostgreSQL uses a role-based access control model to manage database security. Roles are similar to user groups and can be granted specific privileges such as reading, writing, or administering data. In this project, roles enforce the principle of least privilege, ensuring that each type of user or process has only the permissions necessary for its function. |
| 97 | + |
| 98 | +These roles govern what the user can access or modify in the database. This structured approach improves data security, simplifies permission management, and makes the system easier to audit. |
| 99 | + |
| 100 | +Below is a breakdown of the roles used in this system and their respective permissions: |
| 101 | + |
| 102 | +| Role Name | Description | Accessible Schemas | Allowed Actions | Assigned User | |
| 103 | +|---------------|----------------------------------------------|--------------------------|--------------------------------------------------|---------------------| |
| 104 | +| `app_reader` | Read-only role for viewing app data | `public` | `SELECT` on all tables | `game_reader` | |
| 105 | +| `app_writer` | Full CRUD for app, also writes audit logs | `public`, `audit` | `SELECT`, `INSERT`, `UPDATE`, `DELETE` on `public`; `INSERT` on `audit_log`; `SELECT` on `audit_log_id_seq` | `game_writer` | |
| 106 | +| `auditor` | Read-only access to audit logs | `audit` | `SELECT` on all audit tables | `audit_user` | |
| 107 | +| `report_user` | Analytics/reporting access | `analytics` | `SELECT` on all materialized views | `report_user_app` | |
| 108 | +| `backup_role` | Least-privilege access for backup automation | `public`, `audit`, `analytics` | `CONNECT`; `SELECT` on all tables and sequences; `USAGE` on all schemas | `backup_user` | |
| 109 | + |
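
The table above could be expressed roughly as follows. This is a minimal sketch covering three of the roles, not the project's actual script: the `NOLOGIN` group-role layout and the `change_me` password are assumptions, and the authoritative statements are the ones in the repository's SQL folder.

```sql
-- Sketch only: group roles hold the privileges, login users become members.
CREATE ROLE app_reader NOLOGIN;
CREATE ROLE app_writer NOLOGIN;
CREATE ROLE auditor    NOLOGIN;

-- app_reader: read-only access to application data.
GRANT USAGE  ON SCHEMA public TO app_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_reader;

-- app_writer: full CRUD on public, insert-only on the audit log.
GRANT USAGE ON SCHEMA public, audit TO app_writer;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_writer;
GRANT INSERT ON audit.audit_log TO app_writer;

-- auditor: read-only access to the audit tables.
GRANT USAGE  ON SCHEMA audit TO auditor;
GRANT SELECT ON ALL TABLES IN SCHEMA audit TO auditor;

-- Login user (placeholder password) that inherits from its group role.
CREATE ROLE game_reader LOGIN PASSWORD 'change_me';
GRANT app_reader TO game_reader;
```

Note that `GRANT ... ON ALL TABLES IN SCHEMA` only affects tables that exist when it runs; tables created later need a new grant or `ALTER DEFAULT PRIVILEGES`.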
| 110 | + |
| 111 | + |
| 112 | +## SQL |
| 113 | + |
| 114 | +### Triggers |
| 115 | + |
| 116 | +The system uses PL/pgSQL triggers to audit changes made to critical tables. |
| 117 | +These triggers insert records into the audit.audit_log table whenever an INSERT, UPDATE, or DELETE operation happens. |
| 118 | +This enables real-time tracking of user modifications across important entities. |
| 119 | + |
| 120 | +For example: |
| 121 | + |
| 122 | +`CREATE TRIGGER trg_audit_videogame |
| 123 | +AFTER INSERT OR UPDATE OR DELETE ON public.videogames_register_videogame |
| 124 | +FOR EACH ROW |
| 125 | +EXECUTE FUNCTION audit_if_modified();` |
| 126 | + |
| 127 | +You can find all the triggers in the SQL folder of this repository. |
| 128 | + |
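
For orientation, a trigger function such as `audit_if_modified()` might look like the sketch below. The column names (`table_name`, `operation`, `changed_data`, `changed_at`, `changed_by`) are assumptions derived from the fields listed in the audit schema description; the real definitions are in the SQL folder and may differ.

```sql
-- Sketch only: column names are illustrative, not the project's actual ones.
CREATE SCHEMA IF NOT EXISTS audit;

CREATE TABLE IF NOT EXISTS audit.audit_log (
    id           BIGSERIAL   PRIMARY KEY,
    table_name   TEXT        NOT NULL,
    operation    TEXT        NOT NULL,               -- INSERT, UPDATE or DELETE
    changed_data JSONB,                              -- row image as JSON
    changed_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    changed_by   TEXT        NOT NULL DEFAULT current_user
);

CREATE OR REPLACE FUNCTION audit_if_modified() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        -- NEW is not assigned for DELETE, so log the old row.
        INSERT INTO audit.audit_log (table_name, operation, changed_data)
        VALUES (TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
        RETURN OLD;
    END IF;

    -- INSERT and UPDATE: log the new row.
    INSERT INTO audit.audit_log (table_name, operation, changed_data)
    VALUES (TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
    RETURN NEW;
END;
$$;
```

Because the function reads the table name from `TG_TABLE_NAME`, a single function can be attached to every audited table with one `CREATE TRIGGER` statement each.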
| 129 | +### Indexes |
| 130 | + |
| 131 | +To improve query performance, several indexes were added to optimize filtering. |
| 132 | +Indexes were chosen based on profiling with EXPLAIN ANALYZE; the before and after results below show how execution time is reduced for the most common queries: |
| 133 | + |
| 134 | +`-- For grouping/searching by genre |
| 135 | +CREATE INDEX idx_genre_id ON public.videogames_register_videogame(genre_id);` |
| 136 | + |
| 137 | +**Before Index** |
| 138 | + |
| 139 | + |
| 140 | + |
| 141 | +**After Index** |
| 142 | + |
| 143 | + |
| 144 | + |
| 145 | + |
| 146 | +`-- For filtering by date |
| 147 | +CREATE INDEX idx_release_date ON public.videogames_register_videogame(release_date);` |
| 148 | + |
| 149 | + |
| 150 | +**Before Index** |
| 151 | + |
| 152 | + |
| 153 | + |
| 154 | +**After Index** |
| 155 | + |
| 156 | + |
| 157 | + |
| 158 | +`-- For searching by title |
| 159 | +CREATE INDEX idx_title_lower ON public.videogames_register_videogame(LOWER(title));` |
| 160 | + |
| 161 | +**Before Index** |
| 162 | + |
| 163 | + |
| 164 | +**After Index** |
| 165 | + |
| 166 | + |
| 167 | + |
| 168 | + |
| 169 | +Several more indexes were added on the most important columns of the other tables; you can find them in the SQL folder. |
| 170 | + |
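
The before and after comparisons can be reproduced with queries along these lines, run once before and once after creating the index. The filter values are hypothetical placeholders; only the table and column names come from the index definitions above.

```sql
-- Hypothetical genre id; compare the plan and timing with and without idx_genre_id.
EXPLAIN ANALYZE
SELECT title
FROM public.videogames_register_videogame
WHERE genre_id = 3;

-- idx_title_lower is an expression index, so the query has to use the
-- same LOWER(title) expression for the planner to consider it.
EXPLAIN ANALYZE
SELECT title
FROM public.videogames_register_videogame
WHERE LOWER(title) = 'some title';
```

On a small table the planner may still choose a sequential scan after the index exists, because scanning a few pages is cheaper than an index lookup.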
| 171 | + |
| 172 | +## Materialized Views |
| 173 | + |
| 174 | +The analytics schema contains several materialized views designed to support reporting, trend analysis, and dashboard features. |
| 175 | +These views use PostgreSQL’s window functions, aggregation, and ranking. |
| 176 | + |
| 177 | +For example: |
| 178 | + |
| 179 | +`CREATE MATERIALIZED VIEW analytics.cumulative_releases AS |
| 180 | +WITH yearly_counts AS ( |
| 181 | + SELECT |
| 182 | + EXTRACT(YEAR FROM release_date)::INT AS year, |
| 183 | + COUNT(*) AS games_this_year |
| 184 | + FROM public.videogames_register_videogame |
| 185 | + GROUP BY EXTRACT(YEAR FROM release_date) |
| 186 | +) |
| 187 | +SELECT |
| 188 | + year, |
| 189 | + games_this_year, |
| 190 | + SUM(games_this_year) OVER (ORDER BY year) AS cumulative_total |
| 191 | +FROM yearly_counts |
| 192 | +ORDER BY year; |
| 193 | +` |
| 194 | + |
| 195 | +This materialized view is also integrated with the Django application frontend. |
| 196 | + |
| 197 | +You can find the materialized views in the SQL folder of this repository. |
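
Materialized views store a snapshot, so they have to be refreshed to pick up new data. A minimal sketch of a refresh follows; the index name `idx_cumulative_releases_year` is an assumption, and how the project actually schedules refreshes is not shown here.

```sql
-- Plain refresh: blocks reads on the view while it is rebuilt.
REFRESH MATERIALIZED VIEW analytics.cumulative_releases;

-- CONCURRENTLY keeps the view readable during the refresh, but it
-- requires a unique index on the materialized view.
CREATE UNIQUE INDEX IF NOT EXISTS idx_cumulative_releases_year
    ON analytics.cumulative_releases (year);

REFRESH MATERIALIZED VIEW CONCURRENTLY analytics.cumulative_releases;
```

The refresh is typically run by the view's owner, so a read-only role such as `report_user` would normally not be the one issuing it.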
| 198 | + |
| 199 | +## Backup/restore step by step |
| 200 | + |
| 201 | + |
| 202 | +First, open Task Scheduler on Windows. |
| 203 | +Create a new task: |
| 204 | + - **Trigger**: Daily at 02:00 AM |
| 205 | + - **Action**: Run `python.exe` |
| 206 | + - **Arguments**: `C:\path_to_backup.py` |
| 207 | + - **Start in**: `C:\path_to_our_project` |
| 208 | + |
| 209 | +After that, make sure that `backup.py`: |
| 210 | + - Stores SQL backups in the `backups/` folder when it runs |
| 211 | + - Pulls credentials securely from the `backup.env` file |
| 212 | + |
| 213 | +To validate restores, it is preferable to have Docker Desktop installed and running. You can search for the postgres image in Docker Desktop, then pull and run it, entering the correct path to the backups folder written by `backup.py`, |
| 214 | +or you can run the following command in PowerShell: |
| 215 | + |
| 216 | +`docker run -d --name demo -e POSTGRES_PASSWORD=demo_pass -v "C:\Users\youruser\PycharmProjects\Videogames_project\backups:/backups" postgres:latest` |
| 220 | + |
| 221 | +This mounts the backup folder inside the container. |
| 222 | + |
| 223 | +Then, to test it you can run: |
| 224 | + |
| 225 | +`docker exec -it demo bash` |
| 226 | + |
| 227 | +`su - postgres` |
| 228 | +`psql -c "DROP DATABASE IF EXISTS videogames_restore_test;"` |
| 229 | + |
| 230 | +`psql -c "CREATE DATABASE videogames_restore_test;"` |
| 231 | + |
| 232 | +`latest=$(ls -1t /backups/videogames_backup_*.sql | head -n1)` |
| 233 | + |
| 234 | +`psql -d videogames_restore_test -f "$latest"` |
| 235 | + |
| 236 | +`psql -d videogames_restore_test -c "SELECT COUNT(*) FROM public.videogames_register_videogame;"` |
| 237 | + |
| 238 | +The output should be a valid row count, for example 42. |
| 239 | + |
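
Beyond a single row count, a few catalog queries can give more confidence that the restore is complete. This is an optional sketch to run with `psql -d videogames_restore_test`; it assumes the dump contains all three schemas described above.

```sql
-- Are the three schemas present?
SELECT schema_name
FROM information_schema.schemata
WHERE schema_name IN ('public', 'audit', 'analytics')
ORDER BY schema_name;

-- Were the materialized views restored and populated?
SELECT matviewname, ispopulated
FROM pg_matviews
WHERE schemaname = 'analytics';

-- Were the user-defined (audit) triggers restored?
SELECT tgname, tgrelid::regclass AS table_name
FROM pg_trigger
WHERE NOT tgisinternal
ORDER BY tgname;
```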
| 240 | + |
| 241 | + |
| 242 | + |
| 243 | +## How the Components Work Together |
| 244 | +----------------------------------- |
| 245 | +This project is structured in a way where each component has a defined responsibility. |
| 246 | +The Django web app handles user interaction and data management, interacting with the public schema via the app_writer role. |
| 247 | +The audit module logs changes in real time using triggers and stores them in a separate schema that only the auditor role can read. |
| 248 | +The reporting module accesses materialized views in the analytics schema through the report_user role. |
| 249 | +Backups are automated using a dedicated backup_user and a scheduled task, with Docker enabling isolated restoration and validation in a test container. |
| 250 | + |
| 251 | + |
| 252 | + |
| 253 | +## Rationale for the Chosen Architecture |
| 254 | +-------------------------------------- |
| 255 | + |
| 256 | +This layered architecture ensures separation of concerns, data security, and operational resilience. By isolating schemas and limiting role permissions, we apply the principle of least privilege, reducing risk in case of compromise. |
| 257 | +Audit logs and materialized views enhance accountability and analytical capabilities without affecting transactional performance. In addition, using Docker for restore validation ensures that backups are not only taken but also tested in a reproducible environment. |
| 258 | + |