First-time database design: am I overengineering? [closed]

Background

I’m a first year CS student and I work part time for my dad’s small business. I don’t have any experience in real world application development. I have written scripts in Python, some coursework in C, but nothing like this.

My dad has a small training business and currently all classes are scheduled, recorded and followed up via an external web application. There is an export/”reports” feature but it is very generic and we need specific reports. We don’t have access to the actual database to run the queries. I’ve been asked to set up a custom reporting system.

My idea is to create the generic CSV exports and import (probably with Python) them into a MySQL database hosted in the office every night, from where I can run the specific queries that are needed. I don’t have experience in databases but understand the very basics. I’ve read a little about database creation and normal forms.

We may start having international clients soon, so I want the database to not explode if/when that happens. We also currently have a couple big corporations as clients, with different divisions (e.g. ACME parent company, ACME healthcare division, ACME bodycare division)

The schema I have come up with is the following:

  1. From the client perspective:
    • Clients is the main table
    • Clients are linked to the department they work for
      • Departments can be scattered around a country: HR in London, Marketing in Swansea, etc.
      • Departments are linked to the division of a company
    • Divisions are linked to the parent company
  2. From the classes perspective:
    • Sessions is the main table
      • A teacher is linked to each session
      • A statusid is given to each session. E.g. 0 – Completed, 1 – Cancelled
      • Sessions are grouped into “packs” of an arbitrary size
    • Each packs is assigned to a client

I “designed” (more like scribbled) the schema on a piece of paper, trying to keep it normalised to the 3rd form. I then plugged it into MySQL Workbench and it made it all pretty for me:
(Click here for full-sized graphic)

alt text
(source: maian.org)

Example queries I’ll be running

  • Which clients with credit still left are inactive (those without a class scheduled in the future)
  • What is the attendance rate per client/department/division (measured by the status id in each session)
  • How many classes has a teacher had in a month
  • Flag clients who have low attendance rate
  • Custom reports for HR departments with attendance rates of people in their division

Question(s)

  • Is this overengineered or am I headed the right way?
  • Will the need to join multiple tables for most queries result in a big performance hit?
  • I have added a ‘lastsession’ column to clients, as it is probably going to be a common query. Is this a good idea or should I keep the database strictly normalised?

Thanks for your time

11 Answers
11

Leave a Comment