Abstract
Unlock business rule potential with a Python sandbox! Isolate, secure, and optimize performance. Forget Docker complexities. Choose thread/process wisely. Employ AST, Linux security, ulimit, Time Guard. Speed up with dependency preloading. Reuse the sandbox, reduce IO. Elevate your rule execution game!
Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. Thank you for joining today.
I'm thrilled to have the
opportunity to share with you some exciting insights
and practices in the world of python. Today,
I'm going to introduce the topic of building a Python
sandbox for dynamic rule execution.
Our session will uncover not just the theoretical underpinnings, but also practical techniques and code. So let's get started.
Before we dive into today's topic, I'd like
to give you a brief overview of our agenda
so you will know what to expect from our session.
We will begin by defining the key concepts behind rule execution and sandboxing.
Understanding these fundamentals is crucial for the subsequent content.
We will discuss the advantages and potential drawbacks of Docker, virtual machines, and customized solutions. This will help us understand why we implemented a Python sandbox instead of using existing solutions. Then we'll transition into practice. This section is divided into three crucial components: isolation, security, and performance.
Let's talk about the context first.
Imagine a world where decisions are made efficiently and consistently. Rules are the heart of this engine: they are the business logic that dictates how our system behaves in various scenarios. As a powerful and easy-to-use language, Python is one of the best options for describing business logic in a rule engine.
Moving on to the rule engine itself: it ensures that all the rules are followed to the letter and executed in the Python interpreter. Without isolation, rules for different purposes may influence each other, causing unexpected behaviors. Without control, rules are capable of actions that may go beyond our intentions, potentially affecting the engine's stability and security.
As we start the journey to create a sandbox, we must address the elephant in the room: why not use established solutions like virtual machines or Docker? Virtual machines are separate entities, each with its own operating system, running in isolation on top of the host OS. They offer a high level of security due to this isolation, but at the cost of performance and resource consumption. On the other hand,
Docker has revolutionized the concept of containerization. Containers are more lightweight compared to VMs, sharing the host OS kernel while still providing process and execution isolation.
This makes Docker an attractive option for many
scenarios. When it comes to our rule engine, the balance of performance and resource consumption takes center stage. Our rule engine needs to execute numerous small tasks at a high frequency. The startup time of a virtual machine or even a Docker container can introduce huge latency that's unacceptable for our use case. Moreover, the resource overhead, while smaller with Docker, is still significant at the scale at which our engine operates.
This brings us to our customized solution. We have crafted a sandbox that is tailored to the unique demands of our engine. Our sandbox is designed to be isolated, secure, and high-performance.
Isolation is the first pillar of our sandbox: restricting access to hardware resources ensures that sandboxed processes cannot use excessive resources, perform unexpected network behaviors, or carry out unauthorized operations. On top of that, our sandbox offers extensive customization options. We understand that one size doesn't fit all, so resource limits and security policies can be tailored to fit the specific rules of different business scenarios.
Moving to the next pillar, security: our prime directive is to prevent the execution of malicious code. The sandbox is designed to detect threats before they can cause harm. A blocklist mechanism is in place to control the usage of Python modules and functions, and we use error handling to ensure that unexpected code behaviors do not escalate into system crashes.
Finally, we arrive at the performance pillar. Our sandbox is engineered to handle massive request volumes with ease. It's built to withstand the high throughput demands of our engine. Low latency is the highlight of our solution.
When talking about isolation, we have to satisfy three key points. The first is limiting resource usage. As you can see from the screenshot on the slides, we use the built-in Python package resource to achieve this goal. Underlying resource is the setrlimit API in the Linux kernel, which can be used to specify limits on particular system resources and to request usage information about either the current process or its children. As you can see, with the resource package we can easily control CPU time, memory usage, stack size, even the file system quota. This prevents almost all possible effects on the system that could be caused by user-defined rules. Moreover, the limits can be applied and changed in real time.
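As a rough illustration of this idea (a sketch, not the speaker's actual code), the snippet below applies setrlimit caps inside a child process before running a rule, so the parent process stays unrestricted. The helper names and limit values are illustrative, and RLIMIT_AS behavior assumes Linux.

```python
import multiprocessing
import resource


def _run_rule_limited(rule, cpu_seconds, memory_bytes, out_queue):
    # Applied inside the child only, so the parent stays unrestricted.
    resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
    resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))
    try:
        out_queue.put(("ok", rule()))
    except MemoryError:
        out_queue.put(("error", "memory limit exceeded"))
    except Exception as exc:
        out_queue.put(("error", repr(exc)))


def execute(rule, cpu_seconds=2, memory_bytes=512 * 1024 * 1024):
    """Run one rule in a child process under CPU and memory caps."""
    out_queue = multiprocessing.Queue()
    worker = multiprocessing.Process(
        target=_run_rule_limited,
        args=(rule, cpu_seconds, memory_bytes, out_queue),
    )
    worker.start()
    worker.join()
    try:
        return out_queue.get(timeout=1)
    except Exception:
        # e.g. the kernel killed the child (SIGXCPU) before it could reply
        return ("killed", None)
```

Because the limits are arguments, each invocation can carry its own caps, which is what makes the per-rule, real-time tuning described above possible.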
This means that each rule execution can be configured based on its specific needs and the context of its invocation. Our sandbox can dynamically adjust resource allocations and security measures for different cases, whether processing the same rule or different rules. This adaptive approach allows us to maximize resource utilization efficiency while maintaining strict security policies. Our sandbox also implements strict controls on the execution time of each process. By monitoring and managing running time, we prevent logic like dead loops and ensure that all operations complete within their allotted time frames, maintaining smooth and predictable performance.
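One common way to implement such a time guard on Unix (shown here as an assumed sketch, not necessarily how the speaker's system does it) is a SIGALRM-based context manager; note that this approach only works in the main thread of the main interpreter.

```python
import signal


class TimeGuard:
    """Abort a rule that exceeds its wall-clock budget (Unix-only sketch).

    Hypothetical helper illustrating a dead-loop guard: SIGALRM
    interrupts the running rule and we surface a TimeoutError.
    """

    def __init__(self, seconds):
        self.seconds = seconds

    def _handler(self, signum, frame):
        raise TimeoutError(f"rule exceeded {self.seconds}s limit")

    def __enter__(self):
        self._old = signal.signal(signal.SIGALRM, self._handler)
        signal.alarm(self.seconds)  # schedule the interrupt
        return self

    def __exit__(self, exc_type, exc, tb):
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, self._old)
        return False  # let a TimeoutError propagate to the caller
```

A rule would then run as `with TimeGuard(2): run(rule)`, so even an infinite loop is cut off after its allotted time.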
Let's talk about security. To secure rule execution, our sandbox leverages the power of AST analysis combined with a blocklist to filter out potentially malicious code at both the module and function level. Resource usage and system call limitations are reactive: they come into play during execution. Our AST-based analysis, however, represents a proactive approach. By analyzing the code's structure and semantics, we can identify and block malicious patterns before they run.
Python provides a built-in package called ast, which provides capabilities to traverse and analyze the syntax tree. Using this package, we can easily find out which packages are imported into the scope and which functions are used in user rules. Combined with the blocklist, we can block malicious code without running it. By preventing malicious code up front, we save resources and avoid potential damage to the system or the network, damage which might be irreversible even with the strictest execution controls.
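A minimal sketch of this kind of check is below; the blocklists are placeholder examples, not the real policy, and the function name is made up for illustration. It walks the parsed tree and flags blocked imports and blocked bare-name calls before any code runs.

```python
import ast

# Illustrative blocklists only; a real policy would be far more complete.
BLOCKED_MODULES = {"os", "subprocess", "socket"}
BLOCKED_CALLS = {"eval", "exec", "open", "__import__"}


def find_violations(source):
    """Walk the rule's syntax tree and collect blocked imports/calls."""
    violations = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import os.path` still grants access to `os`, so check the root.
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                violations.append(f"import from {node.module}")
        elif isinstance(node, ast.Call):
            # Flags direct calls like eval(...); attribute calls would
            # need additional handling in a real checker.
            if isinstance(node.func, ast.Name) and node.func.id in BLOCKED_CALLS:
                violations.append(f"call to {node.func.id}")
    return violations
```

If `find_violations` returns anything, the rule is rejected before execution, which is exactly the proactive step that complements the reactive runtime limits.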
In the sandbox environment, we also strike a balance between transparency for the user and stability for the service. Error collection plays a pivotal role in this balance. Let's take a closer look at how we capture and display errors from user rules while simultaneously shielding our system from potential crashes. Rules run in our sandbox, isolated from the core of our service. This means that any errors within rules won't escalate to affect the service itself. As you can see from the screenshot, when a rule encounters an error, our sandbox doesn't simply shut down in silence. Instead, to give the user enough information to detect which part went wrong, the sandbox will try to print out traceback information and return it to the user for further investigation.
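The shape of such a wrapper might look like the following sketch; the function name and result protocol are assumptions for illustration. Any exception from the user rule is converted into a formatted traceback instead of propagating into the engine.

```python
import traceback


def run_rule(source, context=None):
    """Execute a rule's source; return its result or the traceback.

    Hypothetical wrapper: a real sandbox would combine this with the
    AST checks and resource limits discussed earlier.
    """
    namespace = dict(context or {})
    try:
        exec(compile(source, "<rule>", "exec"), namespace)
        # By convention here, the rule stores its output in `result`.
        return {"ok": True, "result": namespace.get("result")}
    except Exception:
        # Capture the full traceback so the user can see what failed,
        # while the engine itself keeps running.
        return {"ok": False, "traceback": traceback.format_exc()}
```

A failing rule thus yields something like `{"ok": False, "traceback": "...ZeroDivisionError..."}` for the user to investigate, and the service never crashes on their behalf.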
Now let's talk about performance. Achieving isolation in a sandbox environment, especially for high-volume request scenarios, presents a set of unique challenges. In the initial design, we considered spawning a new sandbox process for every incoming request. This provides unparalleled isolation, ensuring that each rule is executed in a completely separate environment. However, this method introduces significant time consumption for each request, because the Python interpreter needs to import the necessary packages for each new process, which leads to disk I/O saturation when dealing with a high number of requests.
We found that with only 20 to 30 requests per second, the sandbox would exhaust the disk bandwidth, and the time consumption for each request would increase from a few milliseconds to hundreds of milliseconds or even several seconds. This disk risk is worst when most rules use similar Python packages. An experiment shown on the slides reproduces this situation. As you can see, we use a rule which only imports the requests package; with 16 processes on a machine with 8 CPU cores, the execution time increased to 300 milliseconds. Generally, executing a rule only costs several milliseconds, so the huge time consumption in the preloading stage is obviously unacceptable.
To tackle this performance issue, we established a Python interpreter pool for interpreter reuse, and we also preload common libraries at interpreter startup. The Python interpreter will not try to re-import libraries that already exist in memory when executing further rules. This eliminates the disk risk, significantly speeding up request handling and reducing library import time to almost zero, effectively optimizing system performance and response time.
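The pooling-plus-preloading idea can be sketched with multiprocessing.Pool, as below; this is an assumed simplification (class and helper names are invented, and the preloaded library list is illustrative), while the real system reuses full sandbox interpreters with the resource limits applied.

```python
import importlib
import multiprocessing

# Illustrative list; the talk's engine would preload e.g. requests.
COMMON_LIBS = ["json", "math", "re"]


def _preload():
    # Runs once per worker at startup: pay the import cost up front,
    # so later rules find these modules already in memory.
    for name in COMMON_LIBS:
        importlib.import_module(name)


def _run(source):
    # Execute one rule; by convention it stores its output in `result`.
    namespace = {}
    exec(source, namespace)
    return namespace.get("result")


class SandboxPool:
    """Reusable pool of worker interpreters with preloaded libraries."""

    def __init__(self, workers=4):
        self.pool = multiprocessing.Pool(workers, initializer=_preload)

    def execute(self, source):
        # Workers are reused across requests, so no per-request
        # interpreter startup or repeated disk reads for imports.
        return self.pool.apply(_run, (source,))

    def close(self):
        self.pool.close()
        self.pool.join()
```

Because each worker survives across requests, the import cost is paid once per worker rather than once per rule, which is what collapses the per-request latency back to a few milliseconds.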
In conclusion, our Python sandbox for dynamic rule execution offers a robust and secure environment for running user-defined rules. It represents a significant step forward in both flexibility and security for rule execution. We hope this innovation will empower users to achieve more with our platform. Thank you for joining this technical session.