Topic: designtalk: Sanitized Expression Evaluation in Python

Code Refugee

  • Wise Sage
  • *****
  • Posts: 1489
  • To Serve Man
designtalk: Sanitized Expression Evaluation in Python
« on: June 22, 2017, 07:17:02 AM »
Since we are a tight knit religious community based around the communochaotic teachings of Kek, I thought I would share with my fellow community members a secret ritual device I created that has served me well in many different scenarios.

Many times when processing user data it would be very convenient for the user to be able to enter their own formulaic expressions and evaluate those down to a real number, just like a spreadsheet does.

But a danger is that malicious users are clever at constructing input that escapes the pen, roots the server, and exerts its own control. User input must therefore be sanitized and sandboxed, a tedious and error-prone process.

So here is a simple expression evaluator that accepts arbitrary user input without handing it the built-in or system functions.

Code: [Select]
def calculator(expression, **context):
    context.update({"__builtins__": {}})
    return float(eval(expression, context))

This is remarkably short code for all that it does. In Python, eval() evaluates a Python expression given as a string. As we know, this is a very dangerous practice when the string comes from outside our control! Normally eval() inherits the global and local symbol tables from the point where it is called, which includes lots of dangerous things like access to system functions. Fortunately eval() has an optional second argument, a dictionary that replaces the global symbol table. Unfortunately, if that dictionary has no "__builtins__" entry, eval() adds a reference to the built-in functions itself. Fortunately, if you set "__builtins__" to an empty dictionary, that step is short-circuited and nothing from the surrounding context, not even the built-in functions, is pulled in.
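To see those two behaviors side by side, here is a small sketch. The helper name can_see_open is made up for illustration, and the snippet should run under Python 2.7 or 3.

```python
def can_see_open(scope):
    """Return True if the name 'open' resolves when evaluated in scope."""
    try:
        eval("open", scope)
        return True
    except NameError:
        return False

bare_scope = {}                      # no "__builtins__" key supplied
locked_scope = {"__builtins__": {}}  # built-ins explicitly emptied

print(can_see_open(bare_scope))      # True: eval() added the built-ins itself
print("__builtins__" in bare_scope)  # True: the key was inserted into our dict
print(can_see_open(locked_scope))    # False: looking up 'open' raises NameError
```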

In addition to short-circuiting the built-ins, it's also nice to be able to easily supply specific symbols that are useful for a particular expression. We do this through the use of Python's keyword-argument (kwarg) facility for functions, which allows functions to have arbitrary numbers of named arguments.

Here are a couple of examples of use.

Code: [Select]
>>> print calculator("2+3.01/97")
2.03103092784

>>> c1 = 101.2
>>> c2 = 0.3
>>> print calculator("2*a+34.1*b", a=c1, b=c2)
212.63

This makes it easy to let people enter arithmetic expressions in fields that normally take plain numbers. It also handles, to a certain extent, named constants or variables, but only the ones the program explicitly supplies.
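One caveat worth a sketch: emptying "__builtins__" removes names, but it does not stop attribute access on a literal, so an expression like ().__class__ still evaluates. A possible extra layer is to parse the expression first and reject anything that is not plain arithmetic. The names ALLOWED_NODES and is_plain_arithmetic below are hypothetical, the whitelist is an assumption about what a calculator field needs, and the code assumes Python 3.8 or later (for ast.Constant). Treat it as a starting point, not a vetted sandbox.

```python
import ast

# Hypothetical whitelist: only the syntax a calculator field needs.
# ast.Pow is left out on purpose, since huge exponents can tie up the server.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Call, ast.Name, ast.Load,
    ast.Constant, ast.Add, ast.Sub, ast.Mult, ast.Div, ast.UAdd, ast.USub,
)

def is_plain_arithmetic(expression):
    """Return True if expression parses to whitelisted arithmetic syntax only."""
    try:
        tree = ast.parse(expression, mode="eval")
    except (SyntaxError, ValueError):
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False  # attribute access, subscripts, lambdas, etc. land here
        if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
            return False  # numeric literals only, no strings or bytes
        if isinstance(node, ast.Call) and (node.keywords or not isinstance(node.func, ast.Name)):
            return False  # calls must look like name(arg, ...)
    return True

# Emptying __builtins__ does not stop attribute access on a literal:
print(eval("().__class__.__name__", {"__builtins__": {}}))  # tuple

print(is_plain_arithmetic("2*a+34.1*b"))    # True
print(is_plain_arithmetic("sin(3/2*pi)"))   # True
print(is_plain_arithmetic("().__class__"))  # False
```

A caller would run is_plain_arithmetic() first and only pass the string on to calculator() when it returns True.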

Note that eval() will raise an exception if the user gives an invalid expression, so one may wish to handle exceptions in the calling function, or do this:

Code: [Select]
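def calculator(expression, **context):
    try:
        # Empty built-ins: the expression sees only the names in context.
        context.update({"__builtins__": {}})
        return float(eval(expression, context))
    except Exception:
        # Invalid expression, unknown name, division by zero, etc.
        return None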
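
For callers that would rather report why an expression was rejected than receive a bare None, a possible variant returns the error message alongside the result. The name calculator_verbose is made up here, and the exception list is an assumption about what bad calculator input typically raises, not an exhaustive one.

```python
def calculator_verbose(expression, **context):
    """Return (value, None) on success or (None, message) on failure."""
    scope = dict(context)
    scope["__builtins__"] = {}  # same lock-down as calculator()
    try:
        return float(eval(expression, scope)), None
    except (SyntaxError, NameError, TypeError, ValueError,
            ZeroDivisionError, OverflowError) as err:
        # Hand back the exception type and text so the UI can show it.
        return None, "%s: %s" % (type(err).__name__, err)

print(calculator_verbose("2+3"))  # (5.0, None)
print(calculator_verbose("2+"))   # (None, 'SyntaxError: ...')
print(calculator_verbose("1/0"))  # (None, 'ZeroDivisionError: ...')
print(calculator_verbose("a+1"))  # (None, 'NameError: ...')
```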

Shadilay my brothers!

The Gorn

  • I absolutely DESPISE improvised sulfur-charcoal-salt peter cannons made out of hollow tree branches filled with diamonds as projectiles.
  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 21652
  • Gorn Classic, user of Gornix
Re: designtalk: Sanitized Expression Evaluation in Python
« Reply #1 on: June 22, 2017, 08:00:22 AM »
Hmm, interesting.

I only know Python from having written a few data-munging scripts for my own use. One such script I used to scrape and convert the previous message board for this community into MySQL so I could import the message and user base into a standalone forum. And I've written way too many CSV-CSV converters to bridge different databases.

So what you're probably implying is that even legitimate traditional mathematical functions such as log, sin, sqrt, etc. are inaccessible when you do this. If all you need is a calculator function in Python, this is certainly the ticket. Perhaps one could cook up a limited set of math functions that could be used as the replacement function for the symbol table lookup, if that was a concern.

Python has a really smart community around it.

PHP people (web design, yech) are impressed by BS such as the "spaceship operator" which is hardly profound. I watched a YouTube video last night where the guy acted like this thing was the second coming.

The spaceship operator:

Code: [Select]
<=>

Gornix is protected by the GPL. *

* Gorn Public License. Duplication by inferior sentient species prohibited.

Code Refugee

Re: designtalk: Sanitized Expression Evaluation in Python
« Reply #2 on: June 22, 2017, 08:38:29 AM »
Ha ha!  :laugh:  I was thinking, hm, someone will be along to say "But what about math functions?" I was wondering if I should add a section covering that, but thought, if it comes up I'll address it, so here we go.

In general the user typing sin(3/2*pi) into a field doesn't know the back end program is running in Python, and from a usability standpoint should not be expected to know that. So having to type math.sin(3.0/2.0*math.pi) into their field or column isn't anything they'll expect ... or appreciate.

Code: [Select]
import math
print calculator("sin(3.0/2.0*pi)", pi=math.pi, sin=math.sin)

There you go. Now you can see the advantage of being able to rename things with kwargs, and how Python's dynamic typing makes this possible. Data in a container can be numbers, functions, or whatever, and can vary from element to element. Passing in functions is no different from passing anything else. JavaScript also allows this and it is really useful.

We should also address something not covered before. In Python 2, the '/' operator does integer (floor) division when both operands are integers. That's bad for a user calculator since it's not how users think about numbers. In Python 3, '/' was changed to always do true division.
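To make the difference concrete, here is a small sketch meant to be run under Python 3, where the // operator reproduces what Python 2's / does on two integers:

```python
import math

print(3 // 2)  # 1: what Python 2 returns for 3/2 on integers
print(3 / 2)   # 1.5: Python 3 true division

# The same effect inside a user expression like sin(3/2*pi):
floored = math.sin(3 // 2 * math.pi)  # 3 // 2 is 1, so this is sin(pi), about 0
true_div = math.sin(3 / 2 * math.pi)  # sin(1.5*pi)
print(floored)
print(true_div)  # -1.0
```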

So if we're running this in Python 2, we add this at the top of the module that calls eval() to get the Python 3 behavior on this point:

Code: [Select]
from __future__ import division

This changes how our own code works as well, but now the user can do what they really want and this works:

Code: [Select]
from __future__ import division
import math
# ...
print calculator("sin(3/2*pi)", pi=math.pi, sin=math.sin)

Alternatively if you have ones you always want to support you can fold them in as you mentioned:

Code: [Select]
import math
def calculator(expression, **context):
    try:
        # Always-available symbols; "log" is base 10 and "ln" is natural log.
        context.update({
            "__builtins__": {},
            "pi": math.pi, "sin": math.sin, "cos": math.cos,
            "sqrt": math.sqrt, "log": math.log10, "ln": math.log,
        })
        return float(eval(expression, context))
    except Exception:
        return None
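
One design point in the version above: update() runs after the keyword arguments are collected, so the folded-in defaults overwrite a caller-supplied name such as pi. If the caller should win instead, one possible arrangement is sketched below. The names DEFAULT_SYMBOLS and calculator_with_defaults are made up for illustration.

```python
import math

# Hypothetical table of symbols every expression may use.
DEFAULT_SYMBOLS = {
    "pi": math.pi, "sin": math.sin, "cos": math.cos,
    "sqrt": math.sqrt, "log": math.log10, "ln": math.log,
}

def calculator_with_defaults(expression, **context):
    scope = dict(DEFAULT_SYMBOLS)  # start from the defaults
    scope.update(context)          # caller-supplied names take precedence
    scope["__builtins__"] = {}     # set last so a caller cannot restore built-ins
    try:
        return float(eval(expression, scope))
    except Exception:
        return None

print(calculator_with_defaults("sqrt(16) + log(100)"))  # 6.0
print(calculator_with_defaults("2*pi", pi=3))           # 6.0, caller's pi wins
```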

The Gorn

Re: designtalk: Sanitized Expression Evaluation in Python
« Reply #3 on: June 22, 2017, 08:41:49 AM »
Someone? I give you an entire online world here. I'm not just someone on this board.  >:D

At least give me a gold star for comprehending what you're doing here.  :P My programming skills have become old, wheezy and rusty, though. In general discussions about code are bikeshedding to me, but this "was" an IT board, so I'll pitch in.

Cool Italian 80's disco. "Point - Emerging - Probably - Entering"... PEPE!!!!

Code Refugee

Re: designtalk: Sanitized Expression Evaluation in Python
« Reply #4 on: June 22, 2017, 08:47:15 AM »
☆☆☆

At the time I had the thought, the 'someone' was a potentiality, identity unknown. The identity is now known, but my comment referred to the past, before Schroedinger's 🐱 box was opened and the quantum field collapsed.

The Gorn

Re: designtalk: Sanitized Expression Evaluation in Python
« Reply #5 on: June 22, 2017, 08:52:28 AM »
Ok, I think I know what you mean...  ???