How do I deal with a lot of text data in C2?

  • Hi everyone!

    I'm trying to build an interface for a lexical database of Mandarin. Basically, I have huge text files with tab-separated data. I use the rex_csv plugin to read them in C2, and I built the interface to run queries on this database.

    (rex_csv plugin: c2rexplugins.weebly.com/rex_csv.html )

    The interface allows you to search in two different ways:

    1. Search for info by words: you type in words, and if they are found in the database, you get all the data concerning those words.

    2. Search for words by info: you pick all the criteria you need and you get all the words in the database that match those criteria.

    Once you've done your query, you can download a file with all the data you need (you can pick what data shows up in the file).

    So I made this .capx. It's a draft, but everything works as it should. It's better than long explanations:

    drive.google.com/open

    In this draft, the database I use is a very small text file (TESTV3.txt in the .capx). In short, it's made of 6 rows (i.e. 6 words) with 5 columns (5 different pieces of information for each).

    Now, I want to scale that up to a text file with approximately 100,000 rows and 300 columns. That's a lot of data, and it makes the AJAX request for the file fail with a JavaScript error. I can't recreate it right now, but it was something like: "allocation size overflow localhost :50000/myfile.txt on line 185 col 1".

    I think I have been too optimistic about the power of Construct 2. Maybe it cannot deal with so much data? If so, do you have any recommendations on what tools I should use to get my interface working?

  • How big is the text file? I'm guessing hundreds of megabytes?

    You can try splitting it into many smaller parts. And maybe create one or several index files. So, for example, when you need data for the word "cap", you check the index file, it tells you which file contains all the details for this word (say, db0123.txt). Then you request db0123.txt with AJAX and load it into CSV.

    It may be slow, but hopefully will work.
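
    The split-plus-index idea above can be sketched offline, for example in Python. This is only a rough sketch under assumptions that are not in the thread: the first row is a header, the first column holds the word, and each word appears only once. The function names `split_with_index` and `lookup` and the chunk size are made up for illustration; only the `db0123.txt`-style file naming follows the example above.

```python
import os

def split_with_index(src_path, out_dir, rows_per_chunk=1000):
    """Split a tab-separated file into chunk files; return {word: chunk file name}.

    Assumption: first row is a header, first column is the (unique) word.
    """
    os.makedirs(out_dir, exist_ok=True)
    index = {}
    out = None
    count = 0
    with open(src_path, encoding="utf-8") as src:
        header = next(src)  # repeated at the top of every chunk
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            if count % rows_per_chunk == 0:
                # Start a new chunk file: db0000.txt, db0001.txt, ...
                if out:
                    out.close()
                name = f"db{count // rows_per_chunk:04d}.txt"
                out = open(os.path.join(out_dir, name), "w", encoding="utf-8")
                out.write(header)
            out.write(line)
            word = line.rstrip("\r\n").split("\t", 1)[0]
            index[word] = name
            count += 1
    if out:
        out.close()
    return index

def lookup(word, index, out_dir):
    """Open only the chunk that the index points to and return the row as a dict."""
    name = index.get(word)
    if name is None:
        return None
    with open(os.path.join(out_dir, name), encoding="utf-8") as f:
        header = next(f).rstrip("\r\n").split("\t")
        for line in f:
            fields = line.rstrip("\r\n").split("\t")
            if fields[0] == word:
                return dict(zip(header, fields))
    return None
```

    The index itself could be saved as one more small text file, so the client only ever requests the index plus the one chunk it needs.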


    Another (better) solution would be moving the whole thing into a MySQL database and communicating with it via PHP or JS.
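
    To give a feel for what the database route buys you, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in, not MySQL with PHP or JS, purely so the example runs on its own. The table name `lexicon`, its three columns, the sample rows, and both helper functions are hypothetical. The point is that both search modes become short SQL queries, and only the matching rows ever need to be sent to the client.

```python
import sqlite3

# In-memory database with a tiny hypothetical schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lexicon (word TEXT PRIMARY KEY, swag REAL, awesomeness REAL)"
)
conn.executemany(
    "INSERT INTO lexicon VALUES (?, ?, ?)",
    [("cap", 1.5, 3.0), ("cat", 2.0, 7.5), ("dog", 4.0, 1.0)],
)

def info_for_words(conn, words):
    """Search for info by words: return the full rows for the given words."""
    if not words:
        return []
    placeholders = ",".join("?" * len(words))
    sql = (
        "SELECT word, swag, awesomeness FROM lexicon "
        f"WHERE word IN ({placeholders}) ORDER BY word"
    )
    return conn.execute(sql, list(words)).fetchall()

def words_matching(conn, min_swag, max_awesomeness):
    """Search for words by info: return the words that satisfy the criteria."""
    sql = (
        "SELECT word FROM lexicon "
        "WHERE swag >= ? AND awesomeness <= ? ORDER BY word"
    )
    return [row[0] for row in conn.execute(sql, (min_swag, max_awesomeness))]
```

    With a real server-side database, the same two queries would sit behind a small script that the interface calls via AJAX, so the 250MB file never has to be loaded in the browser.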


    By the way, you should probably compress some of the data in the file. Storing each letter of a word in a separate column is not very efficient.

  • Hey! Thanks for helping me again. See, I made use of the CSV plugin you suggested to me before!

    The files are about 250MB and I have three of them. So that would be a loooot of different parts. And I guess the thing would still be slow as hell in the end.

    I guess I'll look into MySQL. Sounds like a better option for that purpose.

    The TEST.txt file is just a silly example. The data in there is just nonsense. Think of the letters as nominal data and "swag" and "awesomeness" as numerical data. I could have named them "characteristic 1" to "characteristic 5" with random characters/numbers in the cells.
