The database structure
In the previous post I outlined the system’s specs and use cases. We also selected the primary use case we should start from – that is filtering screen.
The complexity with DB design here is that we want maximum flexibility on the one hand and we don’t want to lose much in performance on the other hand.
Let’s recap what data about student we should store:
- Phone
- Password
- ICQ
- Name
- Surname
- Patronymic
- Gender
- Birthday
- Faculty
- Course
- Group
- Clubs
- Tags
- OK to send SMS (true/false)
Items 1-9 and #15 are scalar values and can be easily stored in User table as fields of the correspondent type. While Faculty, Course, Group and Clubs are either N-M (Clubs) or 1-N relations. So obvious way to handle them is to create tables and cross-table for N-M relation. Tags can be also represented as N-M relation.
It seems OK until we don’t want to extend the system by adding some more fields. But even without extensibility in mind, we should create 3 separate tables (for courses, faculties and clubs). However if we look closer at these, we’ll see that all of them are kind of groups and we can create a kind of hierarchy:
- University
- Faculty
- Course
- Group
- Course
- Club
- Section
- Faculty
So we can consider everything to be group. With this approach we also store information about relations between these groups.
With such approach we can add more types of groups in the future if needed. Or this solution can be ported to another domain model (e.g. school or business organization) without changes in the structure. We should only define, which groups are required for user. In this case Faculty, course and group are required. However, knowing the group, we can get both course and faculty. But I’m not sure if that is a good idea because this way we’ll have to make several queries to the DB in order to get faculty instead of one.
Here’s the DB scheme we came up with:
Filtering it
Let’s try to compose some simple queries to test if our DB can cope with them.
- Let’s get all male users who’re elder than 18:
-
SELECT * FROM User WHERE gender=‘male’ AND birthday <= ’1992-03-19 23:59:59′
Date for the birthday is set to the 18 years back from now. Everyone who was born before or on 19 of March 1992 will fall into the condition. And gender is obvious.
-
- Let’s try to get all male that have first names starting with ‘K’ and who study on the first or second course:
-
SELECT u.* FROM User u INNER JOIN UserGroup ug ON u.userID=ug.userID WHERE u.gender=‘male’ AND u.name LIKE ‘K%’ AND ug.groupID IN (5,6)
Suppose that in Group table we have groups for the first course and second course and they have Ids 5 and 6 respectively.
-
- Let’s try to search for all who is on the 3rd course and is also in Geeks’ Club:
-
SELECT u.* FROM User u INNER JOIN UserGroup ug ON u.userID=ug.userID WHERE ug.groupID IN (7, 15)
7 is ID of the group “3rd course” and 15 is ID of the group “Geeks’ Club”.
We have the problem with the last query
It will find all users who either studies on the 3rd course or attends Geeks’ Club. However we need 3rd course and Geeks’ club.
There are two possible solutions here. One is SQL-based, another is PHP-based.- 1.SQL-based:
-
SELECT * FROM User WHERE userID IN (SELECT userID FROM UserGroup WHERE userID IN(SELECT userID FROM UserGroup WHERE groupID=15) AND groupID=7)
Here we select users who have group=15. Then from these users we select the ones who have group=7 as well, and finally we select user records that correspond the userIDs found.
I think, the variant with INNER JOIN is also possible here (at least, joining results of the groups filtering instead of sub-querying). It just takes some time to figure it out and it has to deal more with optimization than with the principle of filtering.
Anyway, the more groups we have here, the more complex this query becomes.
UPDATE: Here I’ve been advised to use the following query which is more simple and efficient:-
SELECT
-
u.*
-
FROM
-
USER u
-
INNER JOIN
-
UserGroup ug1
-
ON
-
u.id = ug1.userID AND ug1.groupID=7
-
INNER JOIN
-
UserGroup ug2
-
ON
-
u.id = ug2.userID AND ug2.groupID=15
I think this is the best way of making such queries. Great thanks to ScallioXTX
-
- PHP-based. Basically, the principle is the same:
-
$group7 = getUsersForGroup(7);
-
$group15 = getUsersForGroup(15);
-
$users = getUsersByIDs($userIDs);
getUsersForGroup actually executes the query:
-
SELECT userID FROM UserGroup WHERE groupID=[GROUP]
getUsersByID executes the following query:
-
SELECT * FROM User WHERE userID IN ([ids])
array_intersect is a PHP function that returns the values present in all given arrays, returns the intersection of sets.
For example, if we give it 3 arrays:
(1, 5, 8, 15)
(5, 8, 15, 19)
(1, 5, 15, 25)
it will return (5, 15).
In this case we execute several little queries instead of one big. In MySQL this often gives benefits in performance. -
-
So that’s our filtering approach.
There is one thing to add here. If we have group #1 and #2 and group#2 is a child of #1. And user selected to filter users which are both in #1 and #2, then we should perform a little optimization and search only for #2 because that will be the correct answer to the query. This situation is stupid, but we should act smart even in a stupid queries
Maybe there are better solutions. We’re open to them! Moreover, we realize that there are more experienced developers that already solved such problem. Help us
Tomorrow we have a meeting where we’ll be discussing the architecture, classes etc. I’ll upload our results on the weekends or early next week. This depends on the amount of work done during the meeting.
Stay tuned! Subscribe to my blog’s RSS, share your opinions/suggestions in comments and don’t miss the next post in this series!
No related posts.
Related posts brought to you by Yet Another Related Posts Plugin.
Share this post with a friend













Leave a Reply