visitor (0 QPoints)
  • FR
  • EN
  • NL
  • DE
  • ES
315 experts, 1193 registered users, 1659 questions already answered
European Experts Exchange, the very best site for high-quality IT solutions

New Improved Search!

 


05/10/2011 1h30 : Steve Jobs is dead, the father of Apple ][ is gone, we are all orphaned.

Databases :: MySql :: Best way to store large volumes of data.


By: veloguy U.S.A.  Date: 13/10/2004 00:00:00  English  Points: 100 Status: Answered
Quality : Excellent
I need to store test results from multiple schools. There are 70 multiple choice questions per test and probably 300 students taking the test per school. With nationwide acceptance, this could amount to a very large amount of data.

I would like suggestions/opinions on the best way to store and manage that amount of data in MySQL. I will need to be able to pull the answers from each test group, each school district, and/or the whole country easily. Speed of running the reporting queries is not the most important factor. I'd rather have the data easy to manage.
By: kenfcamp Date: 13/10/2004 21:19:00 English  Type : Comment
Hello:

Interesting project you've got there .. heh

If it were me, I'd utilize 4 tables

Datebase - testing

1 - district/dist_code State (and or city)
2 - school/schol_code dist_code
3 - Student/stud_id schl_code
4 - test stud_id *test1 - 70

you'd be able to get the answer per student
per discrict
per school
or all

depending on how you requested the data to be retrieved.

Hope this helps

KC
By: VGR Date: 13/10/2004 21:55:00 English  Type : Answer
I'm not sure to understand your notation, so forgive me if I duplicate what you thought you wrote 8-)

I would do this :
1) table of school district codes
2) table of students per group [not necessary if you don't want to go down the "group" level]
3) table of groups per school district
you don't need a table for the country as there is only one

4) table of tests performed (eventually with questions/answers?)

5) your results data would hold the student ID, the test ID, his/her group ID, his/her results, eventually (to speed up searches) the school district ID
6) then to search : (quoting you)
[pull the answers from ]
a)each test group : SELECT * FROM results WHERE groupID=...
b)each school district : SELECT * FROM results WHERE schooldistrictID=... or in two queries if schooldistrictID is not in the results table :
SELECT groupID FROM groupsperschool WHERE schooldistrictcodeID=...
(loop on x in above) SELECT * FROM results WHERE groupID=x

c)the whole country easily : SELECT * FROM results;
(or a more complicated query, cascading and grouping data from tables :
schooldistrictcodes -> y
(as above on y)
SELECT groupID FROM groupsperschool WHERE schooldistrictcodeID=y
(loop on x in above) SELECT * FROM results WHERE groupID=x

regards
By: veloguy Date: 13/10/2004 22:34:00 English  Type : Comment
The question is not so much about the way to organize the schools, groups and references to them. It's about the best way to store the large volumes of answer data. 70 questions X 40 students X 40 classes X 100's of schools X 100's of districts can end up being a lot of data. Too much for any one table.

I was thinking that I would have 2 databases. One for the system functions, and another to store the answers. I could create a table for each test instance or class. That table would be named the classID (which would be mapped to the date taken/school/etc with a bridge table), question ID, and the answer. I'll need to be able to run queries like this:

select * from classID where age='14' and gender='f' and q14='agree'

Then I can see how many 14 year old females agree with question 14.

If I need to know that stat for the School I could get it from querying all tables with that SchoolID in the bridge table.

Sould like a good plan - I gave it a lot of thought last night :)

By: VGR Date: 14/10/2004 21:43:00 English  Type : Comment
"I would like suggestions/opinions on the best way to store and manage that amount of data in MySQL"

"The question is not so much about the way to organize the schools, groups and references to them"

OK.
By: kenfcamp Date: 14/10/2004 22:01:00 English  Type : Assist
*The question is not so much about the way to organize

On the contrary, with this amout of data, it's all about the organization, expecially if you desire to have flexibility w/ displaying results.

*It's about the best way to store the large volumes of answer data. 70 questions X 40 students X 40 classes X 100's of schools X 100's of districts can end up being
a lot of data. Too much for any one table.

This is correct, with the example structures above you have all the school/district/student information in 2-3 tables, the test results are in the 4th w/ 70 columns (1 for each question) and a column for the student identifier.

In this way you'll be able to search by school, district, student type (age) or by question/answer.

*I was thinking that I would have 2 databases. One for the system functions, and another to store the answers

If you want the table contents even smaller, you can create a table for each school district, or school, this will lower the amount of data (speed things up) but will require more work from you to make more complicated queries.

As far as speed, I've got tables with well over 1,000,000 pieces of data, you shouldn't have any problem w/ a few thousand.

KC
By: Buraque Date: 14/10/2004 22:30:00 English  Type : Assist
I would be as detailed as I can, because of the possibilty of future needs...
My tables would be like this:

1- CREATE TABLE Test(Q_ID TINY INT, Q_Content TEXT, A TEXT, B TEXT, C TEXT, D TEXT, E TEXT, Right_Answer ENUM("A","B","C","D","E");

2- CREATE TABLE Student(Student_ID INT, St_Name CHAR(100), School_ID INT, Test_Score DEC(3,2));

3- CREATE TABLE School(School_ID INT, School_Name CHAR(50), District_ID INT, Test_Date DATE);

4- CREATE TABLE Districts(District_ID INT, District_Name CHAR(50), State CHAR(30));

5- CREATE TABLE Student_Answers(Student_ID INT, Q_ID INT, Selected_Answer SET("A","B","C","D","E"), Result SET("Right", "Wrong"));

You can pick even the most detailed info from this 5 tables...
By: Buraque Date: 14/10/2004 22:32:00 English  Type : Comment
OOPS!
Sorry in table 5, SET should be ENUM....

Do register to be able to answer

EContact
browser fav
page generated in 671.406030 milliseconds

Why Google AdSense ads ?

compteur
 Ranking-Hits PageRank for this page