Inconsistent memory usage on array of strings?

raimondz:
While working on SmileIde, I noticed strange behavior when an array of strings stores a lot of data. The pictures below show some tests that check how much memory is used when I store the same content in different ways (as a string and as an array of strings where the concatenated content should be equal to the string), along with the implementation of the test. At some point the array starts to use more memory than the string.

This both surprises me and doesn't. Instinctively, I want to say it's because the array allocates memory to store all of its elements, whereas the string only needs to allocate one spot (or maybe not? Allocation must always take place, since strings can change the number of bytes they contain?). I'm not much of a computer scientist, but I'd say the discrepancy is due to the array having to allocate and store data in many places in memory. This is good to note. I've always used string arrays to store large text files because it was easy. Maybe there's a better way ... Say, write a function that turns a string into an array? Pass it a position and it returns a chunk of text.

(as a string and as an array of strings where the concatenated content should be equal to the string)
Not true. Keep in mind that you have to be able to distinguish between ["FFF","FFF","FFF"] and ["FFFFF","FF","FF"], so even if all the data were stored contiguously (which is very unlikely for various reasons), you would still need to store a length for each string, on top of storing a length for the array itself. I think there's something wrong with the way you're measuring this, though, especially given that a 50-item array comes up as 0 bytes. It also doesn't make sense for the string to take up more space than the array at any point.

For the record, I imagine the data structures look something vaguely like this for SmileBASIC:
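#include <stdint.h>
typedef uint32_t u32;
typedef uint16_t u16;

struct string {
 u32 refcount;
 u32 length;      /* number of UTF-16 code units */
 u16 contents[];  /* UTF-16 */
};

struct string_array {
 u32 refcount;
 u32 length;                 /* number of elements */
 struct string *contents[];  /* one pointer per element */
};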

This is good to note. I've always used string arrays to store large text files because it was easy. Maybe there's a better way ... Say, write a function that turns a string into an array? Pass it a position and it returns a chunk of text.
Yeah, now I'm using a single string and an array of integers that stores the position of each line break. That uses way less memory than the string array. @niconii I also found it weird that the array of 50 strings returns 0 bytes; that's why I posted the code in the first place. That pic contains all the code of the test. I thought it could be a linked list (for one-dimensional arrays), since they support push/unshift and pop/shift operations. From my experience with other programming languages, to change an array's size you need to create a new instance (allocate memory) and then copy over the contents of the old instance. However, I didn't believe that the pointers between nodes would add that much size... that's why I posted this issue. Anyway, I can't say whether this is a bug or not, since I don't know how they implemented the arrays.

Yeah, now I'm using a single string and an array of integers that stores the position of each line break. That uses way less memory than the string array.
I've already begun work on a library for this :D I've been wanting a versatile way to edit 2-D arrays, and I think by using this method I can create a function library that treats strings as 2-D arrays, and uses basic string manipulation to edit them. I don't know why I didn't consider this before. The only drawback to this implementation is the dramatic speed decrease that comes with excessive string manipulation.