That said, if you somehow knew that an application had an insecure classloader setup, *and* you had the ability to send data to it in the serialization format, you could perhaps take advantage of subclassing to inject arbitrary code. This is a bit of a stretch.
Hmm... don't think I was using URLClassLoader, as it didn't exist at the time I messed around with this stuff.
At least, I don't remember it existing back then.
Hrm... Java was first released in 1995. That's about when I finished college, which is about the time I was messing around with this. The 'web' was this new thing introduced by AOL (at least, if you asked the wrong people). I wasn't working over HTTP or HTTPS, but raw TCP/IP. Pretty sure back then I was doing this kind of hacking on a Macintosh of some kind, as I don't think I had an IBM clone of any merit (just an old i386 clone running DOS or something)... something rather old even for those days.
I remember very clearly finding it exciting to get the code to run remotely, though. It was definitely a 'Neato' moment for me. I just don't recall the specifics anymore. I know, at the time, Swing didn't exist, but I think Sun was just starting to work on it.
Still, it might not have been serialization (again, lost the specifics).
Or maybe someone else saw the kind of thing I was doing at Sun and thought, "Oh no... this is very bad..." and pulled it before it gained any real traction.
2018-05-31 08:28 from fleeb @uncnsrd
Hmm... don't think I was using URLClassLoader, as it didn't exist at
the time I messed around with this stuff.
Maybe, maybe not... thing is, if you were deploying applets, there was most definitely something very much like URLClassLoader (whether it was called by that name at the time, or something else) in play.
Maybe you're thinking of RMIClassLoader. RMI is tightly integrated with serialization and does what you're talking about, but should be thought of as an application of serialization, or a layer on top of serialization, rather than serialization itself.
Hmm.
Well, if serialization itself isn't actually serializing the methods, that's fine.
OTOH, whether the official 'serialization' library or something created by someone else, people will find a relatively simple way to transfer state from one place to another through something very much like serialization (if by another name). Just getting rid of it altogether probably won't work.
So with the "serialization of methods" thing discarded, my original point stands: Python + JSON serialization doesn't seem to have this problem.
Why?
I'd defer to LS's expertise in Java to identify why it's a problem in Java.
Serialization, where the transfer of methods isn't involved, shouldn't cause issues.
OTOH, maybe if one serializes the data into a collection of objects that themselves have methods (internally), and those methods rely upon the state of their objects to act in certain ways, I could see that being a problem.
I don't think that when you deserialize JSON into Python objects, the Python objects gain any methods at all... you'd have to go to some additional trouble to add that in, if I have it right.
JavaScript would work in a similar fashion... JSON deserialization happens all the time there, but the objects resulting from it have no methods beyond simple things for handling collections. I expect it's the same in Python.
It really depends on whether you're trying to enforce security boundaries within a JVM (e.g. via a SecurityManager) or not.
http://www.oracle.com/technetwork/java/seccodeguide-139067.html#8
Default deserialization bypasses input validation and may allow the creation of objects with illegal state. If your data structures allow for recursion or circular references, it's easy to see how deserialization of untrusted data could at least lead to a denial-of-service.
https://www.contrastsecurity.com/security-influencers/java-serialization-vulnerability-threatens-millions-of-applications
"What's needed is a way to allow deserialization, but make it impossible for attackers to create instances of arbitrary classes."
A typical enterprise Java application may contain tens to hundreds of thousands of classes, many of which could be third-party and not understood by the 1st-party developers. And because the default deserialization process allows the data stream to specify any of those classes, and the full object graph will be instantiated before readObject() returns, that's a pretty big footprint for what you need to audit before you know your process is safe.
Typically, the only thing that enforces that the object graph returned by readObject() is of the expected class is the type-cast performed after readObject() returns. And by the time that cast has executed, it may already be too late: vulnerable objects have already been instantiated and their code may already have been executed, performing actions based on untrusted data.
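To make the "check comes too late" ordering concrete, here is a rough sketch in Python rather than Java. It uses Python's pickle module purely as an analogy (it is not Java serialization, and the mechanics differ), and the names Expected, Gadget and load_expected are made up for illustration. The point is only the ordering: whatever the stream asks for runs inside the load call, and the type check happens afterwards.

```python
import io
import pickle
from contextlib import redirect_stdout


class Expected:
    """Hypothetical type the application believes it is loading."""


class Gadget:
    """Hypothetical attacker-controlled object (illustration only)."""

    def __reduce__(self):
        # pickle records "call print(...)" in the stream. The call is made
        # while loading, before the caller ever sees a result.
        return (print, ("side effect during load",))


def load_expected(stream):
    obj = pickle.loads(stream)         # the stream's callable runs here
    if not isinstance(obj, Expected):  # the type check only happens afterwards
        raise TypeError("unexpected type: " + type(obj).__name__)
    return obj


stream = pickle.dumps(Gadget())

captured = io.StringIO()
rejected = False
with redirect_stdout(captured):
    try:
        load_expected(stream)
    except TypeError:
        rejected = True

print("rejected:", rejected)
print("but this already ran:", captured.getvalue().strip())
```

The load is rejected, yet the side effect has already happened. Here it is a harmless print; the same ordering problem is what makes loading untrusted streams risky.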
But in my experience, the vast majority of Java applications do not deserialize untrusted data. So deprecating the feature entirely seems, to me, like a knee-jerk overreaction.
If your data structures allow for recursion or circular references, shouldn't the serialization recognize "Oh, these two objects are the same instance... let's preserve that" and ensure that the state is kept stable?
Or... do they not really do that very well?
Deserialization reads the state in successfully, but the attacker created a circular reference where your normal code would not do so. When your application code attempts to traverse the data structure, it gets stuck in an infinite loop.
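A small sketch of the same idea in Python, again as an analogy and not Java's mechanism: pickle (like Java serialization) preserves object identity, so a cycle in the stream comes back as a cycle in memory, while JSON cannot express one at all. The dict layout with a "next" key and the chain_length helper are assumptions made up for this example.

```python
import json
import pickle

# A one-node "linked list" whose next pointer refers back to itself.
node = {"name": "a", "next": None}
node["next"] = node

# JSON has no way to encode the cycle, so the encoder refuses.
try:
    json.dumps(node)
    json_accepts_cycle = True
except ValueError:
    json_accepts_cycle = False

# pickle keeps identity, so the cycle survives the round trip.
loaded = pickle.loads(pickle.dumps(node))
cycle_survived = loaded["next"] is loaded


def chain_length(head):
    """Count nodes in a chain, refusing to follow a cycle."""
    seen = set()
    count = 0
    current = head
    while current is not None:
        if id(current) in seen:
            raise ValueError("cycle detected in chain")
        seen.add(id(current))
        count += 1
        current = current["next"]
    return count


print("json accepts cycle:", json_accepts_cycle)
print("cycle survived pickle:", cycle_survived)
```

Without the seen set, the while loop would never finish on the loaded data. That is the denial-of-service case: the load itself succeeds, and it is the application's later traversal that hangs.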
A typical enterprise Java application may contain tens to hundreds of thousands of classes, many of which could be third-party and not understood by the 1st-party developers. And because the default deserialization process allows the data stream to specify any of those classes, and the full object graph will be instantiated before readObject() returns, that's a pretty big footprint for what you need to audit before you know your process is safe.
Ok, so it isn't possible to serialize the methods themselves, but it's possible to instantiate existing (and possibly arbitrary) methods.
That sounds almost as bad, particularly when a large portion of those methods are from well-known libraries, many of which have published source code so the attacker can try to exploit side effects.
But in my experience, the vast majority of Java applications do not deserialize untrusted data. So deprecating the feature entirely seems, to me, like a knee-jerk overreaction.
Bingo.
You do not instantiate methods; you only instantiate objects. The classes *of those objects* implement the methods. If you're careful and crafty enough, you can build an object graph which is inconsistent or otherwise invalid (perhaps by using a malicious subclass of an otherwise trusted part of the object graph), with a method definition which does naughty things.
The reason Python has no problems with this (especially with JSON) is that the set of classes available for each node in the object graph is closed. You ONLY have a choice of string, number (int or float), boolean, list, dict, or None. Even if you specify an override for dict, it's still a closed set. No amount of custom craftwork is going to convince the JSON decoder to instantiate an object of an arbitrary class.
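A quick way to see the closed set for yourself, as a minimal sketch using the standard json module. The payload and the node_types helper are made up for illustration; the "__class__" key is just an ordinary string key that a hopeful attacker might include.

```python
import json
from collections import OrderedDict

# Hypothetical attacker-supplied document that tries to "name" a class.
payload = '{"__class__": "os.system", "args": ["echo pwned"], "n": [1, 2.5, true, null]}'


def node_types(value, found=None):
    """Collect the type of every value node in a decoded JSON document."""
    found = set() if found is None else found
    found.add(type(value))
    if isinstance(value, dict):
        for item in value.values():
            node_types(item, found)
    elif isinstance(value, list):
        for item in value:
            node_types(item, found)
    return found


# Default decoding: only the built-in types ever appear.
plain = json.loads(payload)
plain_types = node_types(plain)

# "Override for dict": the caller, not the data, picks the replacement.
hooked = json.loads(payload, object_pairs_hook=OrderedDict)
hooked_types = node_types(hooked)

print(sorted(t.__name__ for t in plain_types))
print(sorted(t.__name__ for t in hooked_types))
```

The "__class__" entry comes back as a plain string and nothing is looked up or instantiated because of it. With the hook, the mapping type changes, but only to the one class the calling code chose.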
That said, I do agree that the API should not be deprecated; that's just stupid.
Fixing the interface by subjecting it to additional security measures would have been the right way to go forward.
json vs serialization is apples/oranges. you can decode safely in java; safe json decoding is not a builtin feature of the javascript language (doing it safely requires a parser, same as in java)
The reason Python has no problems with this (especially with JSON) is that the set of classes available for each node in the object graph is closed. You ONLY have a choice of string, number (int or float), boolean, list, dict, or None. Even if you specify an override for dict, it's still a closed set. No amount of custom craftwork is going to convince the JSON decoder to instantiate an object of an arbitrary class.
So the chorus of wankers on Slashdork all those years ago, who were saying that Python, not Java or C#, should become the lingua franca of business logic, were right?
(My tongue is only slightly in my cheek here. I'll take Python over either Java or Microsoft Slightly Different Java any day of the week.)
Python's OK, but it's basically for obsessive-compulsive people who think that every detail is crucially important and therefore make too much of a big deal out of *whitespace*--grow up, jerks, and leave the rest of us alone.
I never, ever implied that JSON didn't need a parser, even in Javascript.
In fact, one of the earliest parsers for it was in Javascript precisely because of its security implications.