Obfuscation discussion

Simeon
Does anyone have experience with making an obfuscator? I'd like to try making one, because if successful, people would be able to compress and rename all variables and functions in their code automatically, and make it nearly impossible to understand. This would be a really nice program for SmileBASIC. My thoughts on how it might get done:
  • Tokenize everything into an array
  • Find the beginning and end of every scope (functions)
  • In the global scope, detect every variable name and assign it a new name (a,b,c,...,aa,ab,ac,...,ba,bb,bc,...) (This will be done by converting from base-10 to base-26)
  • Then go through each scope and continue finding variable names, but this time undo everything upon leaving the scope
  • Keep track of the index of every variable name and replace them all at the end, or replace as we go?
Should it be recursive? Does anybody have experience with this? Would anybody like to help? I'm new to this field, and see this as one of the necessary things SmileBASIC needs but doesn't seem to have. I've been waiting for someone else to make one for a few years now, but nobody seems to have. Questions, comments, concerns? Comment below.
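A minimal sketch of the renaming steps listed above, written in Python rather than SmileBASIC so it can be tried off-device. Everything in it is an assumption made for illustration: the names `short_name`, `TOKEN`, and `rename`, the simplified token patterns, and the tiny `RESERVED` set (a real tool would need SmileBASIC's full list of keywords and built-ins). It treats every name as global, so the per-scope bookkeeping is not shown. One detail worth noting: the sequence a, b, ..., z, aa, ab has no zero digit, so it is not plain base-26, and the conversion has to subtract 1 at each step.

```python
import re

# Hypothetical, deliberately incomplete list of names that must not be renamed.
RESERVED = {"IF", "THEN", "ELSE", "ENDIF", "FOR", "TO", "STEP", "NEXT",
            "WHILE", "WEND", "REPEAT", "UNTIL", "GOTO", "GOSUB", "RETURN",
            "DEF", "END", "VAR", "DIM", "PRINT", "BEEP", "AND", "OR", "NOT"}

def short_name(n):
    """Map 0, 1, 2, ... to a, b, ..., z, aa, ab, ... (bijective base-26)."""
    name = ""
    n += 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        name = chr(ord("a") + rem) + name
    return name

# Simplified token patterns; each alternative is tried in order.
TOKEN = re.compile(r'''
    (?P<string>"[^"\n]*"?)                 # string literal, closing quote optional
  | (?P<comment>(?:'|REM\b)[^\n]*)         # comment running to the end of the line
  | (?P<label>@\w+)                        # label such as @LOOP
  | (?P<number>&[HhBb][0-9A-Fa-f]+|(?:\d+\.?\d*|\.\d+)(?:[Ee][-+]?\d+)?[\#]?)
  | (?P<ident>[A-Za-z_]\w*[$%\#]?)         # identifier with optional type suffix
  | (?P<other>.|\n)                        # any other single character
''', re.VERBOSE)

def rename(source):
    """Replace every non-reserved identifier with a short generated name."""
    mapping = {}
    counter = 0
    out = []
    for m in TOKEN.finditer(source):
        text = m.group()
        if m.lastgroup == "ident" and text.upper() not in RESERVED:
            key = text.upper()
            if key not in mapping:
                # Take the next short name that does not collide with a keyword.
                while True:
                    candidate = short_name(counter)
                    counter += 1
                    if candidate.upper() not in RESERVED:
                        break
                # Keep the type suffix so A$ stays a string variable.
                suffix = text[-1] if text[-1] in "$%#" else ""
                mapping[key] = candidate + suffix
            text = mapping[key]
        out.append(text)
    return "".join(out), mapping

program = 'SCORE=0\nFOR I=1 TO 10\n SCORE=SCORE+I\nNEXT\nPRINT "SCORE";SCORE\n'
obfuscated, names = rename(program)
print(obfuscated)
print(names)
```

Because this version replaces as it goes, it never needs to store token indices. Handling DEF scopes would mean keeping a second mapping per function and discarding it at the matching END.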

"but why?" Yeah, it would be a fun toy project for someone learning... but it almost sounds like you intend to use it in production.

Yeah, I agree. WHY, SIMEON, WOULD YOU USE IT IN PRODUCTION?

My guess would be to release code that you don't want people to copy. Given SmileBASIC's policy it's not that surprising. Obfuscation with JavaScript is pretty popular as well. You could also build your obfuscator to heavily compress your code, which has its own benefits.

That is the answer I expected, but I was hoping Simeon would answer. "don't want people to copy" -- again, why? Keep in mind that obfuscation deters modification more than it deters actual distribution. Edit: I'll point out again that I don't think this is a bad project for learning e.g. parsing or SmileBASIC's quirks, but actually intending to use it is another matter. Why do you want to protect code that you're not selling?

One of the few things I can think of is minification, but that isn't quite the same thing.

Personally, I think the openness of SmileBASIC as a platform is one of its greatest strengths. A big part of SmileBASIC is that people are able to learn from your code, take it and improve on it, or use it to make something new, and you can see that in SmileBoom's policies. From that point of view, I don't think SmileBASIC needs an obfuscator at all. On the contrary, I think it would ultimately be harmful.

Lumage, how would this project not be a good one? There are many good reasons to do this, one being to learn a bit about parsing and variable scopes.

Edit: I'll point out again that I don't think this is a bad project for learning e.g. parsing or SmileBASIC's quirks, but actually intending to use it is another matter. Why do you want to protect code that you're not selling?

Okok, so it could be harmful for SmileBASIC if it gets used commonly. The bad side:
  • SmileBASIC may be less beginner friendly
  • Curious programmers can't be curious anymore
On the other hand:
  • Automatic code minification
  • Non-editable games to prevent cheating
  • I get to learn how to parse better

  • Non-editable games to prevent cheating
I think that editing games is basically a right by now. The only people that would use the obfuscator would be people who want to protect their code: either arrogant beginners or arrogant experts. The problem with noobs obfuscating their code is that they often have unseen bugs that break gameplay like a minute in, while the problem with pros obfuscating their code is that they have things we can learn from, and they're hiding them from us lowly plebeians.

  • Automatic code minification
Graphics size is bound to be an issue far sooner than code size, since a GRP file is 512 KB and the project upload size limit is only 4 MB (20 MB if you have a gold account). However, in the event code size still needs to be reduced, you could just compress the code rather than minify it. This has the benefit of being reversible, and may even result in a smaller file size than minification would, due to redundancy in things like keywords and repeated names.
  • I get to learn how to parse better
You don't have to publish the obfuscator or obfuscated projects to do that.
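On the compression point above, here is a small sketch of what "reversible" means in practice. It uses Python's `zlib` module as a stand-in compressor, which is an assumption: a SmileBASIC project would presumably have to ship its own decompressor, and this sketch does not cover that. The sample `source` text is made up for the demonstration.

```python
import zlib

# Hypothetical stand-in for a project's source: repetitive, like most BASIC code.
source = "".join(
    f"IF ENEMY_HP[{i}]<=0 THEN SCORE=SCORE+ENEMY_VALUE[{i}] ' AWARD POINTS\n"
    for i in range(200)
).encode("utf-8")

packed = zlib.compress(source, 9)   # 9 = highest compression level
restored = zlib.decompress(packed)

print(len(source), "bytes before,", len(packed), "bytes after")
print("round trip intact:", restored == source)
```

The original names, comments, and spacing all come back after decompression, which is exactly what minification cannot offer.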

Good points good points

What's the difference between encrypting a project and obfuscating a project?

Encryption is a two-way process. You encrypt your code to protect it from checks by SmileBoom, and the user decrypts it before use. Obfuscation is a one-way process. You obfuscate your code to protect it from modifications by making the code hard to follow but still valid, and the user runs it as is.

I was actually surprised to see some debate over the, er, ethics of such a program. Here are my thoughts. The program seems to serve two functions: miniaturization of the code, and confusion of the code. For the first function, I think a code miniaturizer would be useful to people who want to cut their file sizes. A miniaturizer could do things like cut out comments, remove unnecessary spaces and indents, and condense function and variable names, perhaps even constants as well (ex. "0.0"->".", "10000000"->"1E7"). Users have the choice of making their code readable, and when it comes to this kind of open platform, we kind of have to accept that users will make responsible decisions when doing so. A code obfuscator would not break SmileBASIC; I doubt anyone but a few would use it anyway. When it comes to the confusion aspect of its functionality, I have a couple of ideas. If you decide you don't care about file size, you could do more than miniaturize to confuse people. SmileBASIC likes to move the parsing cursor to the right and down, and that makes it easy for people to follow the code. But what if, when the cursor reaches the end of a line, it doesn't move down, but jumps to some random place in the code where the next corresponding line is? There would have to be a kind of GOTO @LOCATION token replacing the end of the line that would direct it to the next line of code. This could possibly be implemented using actual GOTOs, kind of like . . .
@39 BEEP 34 GOTO@1
@67 SPDEF 0,0 GOTO@821
@254 X1=X0 GOTO@305
I'm not sure SmileBASIC would like this so much. Pretty sure it would break FOR and WHILE loops. DEF blocks would be handled internally. There might be other ways to handle it, especially for those loops. I don't have my 3DS at the moment, so I can't test it, but it's an idea. Developing a SmileBASIC parser is a good idea. If built into a multipurpose engine, it would have a variety of uses in programs like custom IDEs and whatnot.
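For anyone who wants to play with the GOTO idea off-device, here is a rough Python sketch that generates that kind of layout. The function names `scramble` and `trace` are made up for this example, and it only handles a flat list of straight-line statements, so FOR, WHILE, and DEF blocks are out of scope, as noted above. The output mimics the one-line "label, statement, GOTO" format from the example; whether SmileBASIC accepts that exact form has not been tested here.

```python
import random

def scramble(statements, seed=None):
    """Label each statement, chain them with GOTO, then shuffle the lines."""
    if not statements:
        return []
    rng = random.Random(seed)
    count = len(statements)
    # Pick distinct label numbers; the range is an arbitrary choice.
    labels = rng.sample(range(1, 10 * count + 1), count)
    lines = []
    for i, stmt in enumerate(statements):
        # The last statement ends the program instead of jumping onward.
        jump = f"GOTO @{labels[i + 1]}" if i + 1 < count else "END"
        lines.append(f"@{labels[i]} {stmt} {jump}")
    rng.shuffle(lines)
    # A leading GOTO sends execution to the real first statement.
    return [f"GOTO @{labels[0]}"] + lines

def trace(lines):
    """Follow the GOTO chain and return the statements in execution order."""
    table = {}
    for line in lines[1:]:
        label, rest = line.split(" ", 1)
        table[label] = rest
    order = []
    target = lines[0].split()[1]  # label named by the leading GOTO
    while target is not None:
        rest = table[target]
        if rest.endswith(" END"):
            stmt, target = rest[:-4], None
        else:
            stmt, _, target = rest.rpartition(" GOTO ")
        order.append(stmt)
    return order

original = ["BEEP 34", "SPDEF 0,0", "X1=X0"]
shuffled = scramble(original, seed=1)
print("\n".join(shuffled))
print(trace(shuffled) == original)
```

The `trace` helper is only there to check that the shuffled listing still runs the statements in their original order. It also shows how little work it takes to undo this kind of scrambling.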

-SNIP-Pretty sure it would break FOR and WHILE loops. -The legend of SNIP:ocarina of time-
At least the REPEAT UNTIL loops are safe!

What about def loops?
DEF A
 B
 END

DEF B
 A
 END

A

Code files are so tiny compared to GRPs that there's really no point in minifying them. A single GRP file is about 524 KB, and a code file of that size would need over 20 thousand lines of 20 characters each. Minifying the code might reduce the size to around 1/2 to 1/3, but there are better ways to reduce file size and it's also just annoying to anyone reading your code. Plus, it increases the risk that the original program might be lost, which has happened before with compiled lowerdash projects. And any argument about preventing people from stealing code is pretty much pointless: you're not making money off of it, so it really doesn't matter, and it's very easy to prove that someone has stolen code.

What about def loops?
DEF A
 B
 END

DEF B
 A
 END

A
That's just called recursion.