Random Number Blocking, Revisited
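Java's random number implementations have plenty of pitfalls. This post records a problem I ran into with SecureRandom.getInstanceStrong(), the strengthened random number API added in JDK 1.8.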
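While maintaining ali-tomcat, I once found that a poorly chosen JVM random number algorithm made Tomcat's SessionID generation very slow; see the two articles "Random Numbers and Entropy Pool Strategy on the JVM" and "Slow apache-tomcat Startup in Docker". At the time I did not dig much deeper and assumed that setting -Djava.security.egd=file:/dev/./urandom would avoid the problem. After random number generation blocked every thread again in this project, I found that there are quite a few more rules in this area.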
This project runs on JDK 1.8, and the startup parameters include:
-Djava.security.egd=file:/dev/./urandom
The random number generator is obtained with the method added in Java 8:
SecureRandom.getInstanceStrong();
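When the failure occurred, one thread was stuck here (reported as RUNNABLE, but waiting inside a native file read):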
"DubboServerHandler-xxx:20880-thread-1789" #28440 daemon prio=5 os_prio=0 tid=0x0000000008ffd000 nid=0x5712 runnable [0x000000004cbd7000]
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:246)
at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:410)
at sun.security.provider.NativePRNG$RandomIO.implGenerateSeed(NativePRNG.java:427)
- locked <0x00000000c03a3c90> (a java.lang.Object)
at sun.security.provider.NativePRNG$RandomIO.access$500(NativePRNG.java:329)
at sun.security.provider.NativePRNG$Blocking.engineGenerateSeed(NativePRNG.java:272)
at java.security.SecureRandom.generateSeed(SecureRandom.java:522)
Because this code path takes a lock (locked <0x00000000c03a3c90>), every other thread that reaches it waits for that lock:
"DubboServerHandler-xxx:20880-thread-1790" #28441 daemon prio=5 os_prio=0 tid=0x0000000008fff000 nid=0x5713 waiting for monitor entry [0x000000004ccd8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.security.provider.NativePRNG$RandomIO.implGenerateSeed(NativePRNG.java:424)
- waiting to lock <0x00000000c03a3c90> (a java.lang.Object)
at sun.security.provider.NativePRNG$RandomIO.access$500(NativePRNG.java:329)
at sun.security.provider.NativePRNG$Blocking.engineGenerateSeed(NativePRNG.java:272)
at java.security.SecureRandom.generateSeed(SecureRandom.java:522)
Looking at the code of NativePRNG$Blocking, its documentation says:
A NativePRNG-like class that uses /dev/random for both seed and random material. Note that it does not respect the egd properties, since we have no way of knowing what those qualities are.
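So, puzzlingly, the -Djava.security.egd=file:/dev/./urandom parameter has no effect here: /dev/random is still used as the entropy source, and when the process runs for a long time or calls it frequently, the entropy pool easily runs short and reads block. That led me to the documentation of SecureRandom.getInstanceStrong():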
Returns a SecureRandom object that was selected by using the algorithms/providers specified in the securerandom.strongAlgorithms Security property.
So it has its own algorithm setting, defined in the jre/lib/security/java.security file. The default is:
securerandom.strongAlgorithms=NativePRNGBlocking:SUN
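To check which value a given JVM actually uses, the property can be read at runtime. The following is a minimal sketch, assuming a standard JDK 8 or later; the class name StrongAlgorithmsDemo and the helper strongAlgorithms() are hypothetical names used only for illustration.

```java
import java.security.Security;

class StrongAlgorithmsDemo {

    // Returns the raw value of the Security property, or null if it is not set.
    static String strongAlgorithms() {
        return Security.getProperty("securerandom.strongAlgorithms");
    }

    public static void main(String[] args) {
        String value = strongAlgorithms();
        System.out.println("securerandom.strongAlgorithms = " + value);
        if (value != null) {
            // The value is a comma-separated list of algorithm[:provider] entries.
            for (String entry : value.split(",")) {
                System.out.println("  candidate: " + entry.trim());
            }
        }
    }
}
```

The printed value may differ between JDK versions and platforms, so it is worth running this on the same JVM that serves production traffic.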
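If the value is changed to NativePRNGNonBlocking:SUN, the logic in NativePRNG$NonBlocking is used instead, which reads from /dev/urandom and does not run into the blocking problem. But this file ships with the JDK, and editing it or pointing the JVM at a different file is somewhat troublesome. It would be best to set it through a JVM system property, but unlike the securerandom.source property, which can be configured with the system property -Djava.security.egd=xxx, I searched for a long time and found no corresponding system property. In the end I changed the code: instead of the new SecureRandom.getInstanceStrong method, it now calls SecureRandom.getInstance("NativePRNGNonBlocking").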
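The code change can be sketched roughly as follows. This is an illustration rather than the project's actual code: the class NonBlockingRandomDemo and the helper newNonBlockingRandom() are hypothetical, and the fallback to new SecureRandom() is an assumption added for platforms where the NativePRNGNonBlocking algorithm is not registered (for example Windows).

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

class NonBlockingRandomDemo {

    // Hypothetical helper: prefer the /dev/urandom-backed algorithm, and fall
    // back to the platform default where that algorithm is not available.
    static SecureRandom newNonBlockingRandom() {
        try {
            return SecureRandom.getInstance("NativePRNGNonBlocking");
        } catch (NoSuchAlgorithmException e) {
            // Assumption: the default implementation is acceptable as a fallback.
            return new SecureRandom();
        }
    }

    public static void main(String[] args) {
        SecureRandom random = newNonBlockingRandom();
        byte[] token = new byte[16];
        random.nextBytes(token); // fills the array with random bytes
        System.out.println("algorithm = " + random.getAlgorithm());
    }
}
```

Creating the instance once and sharing it is usually preferable to calling the helper on every request, since SecureRandom is safe for use by multiple threads.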
As for how the two SecureRandom algorithm implementations, SHA1PRNG and NativePRNG, relate to the securerandom.source setting, I found an article that explains it very clearly: Using the SecureRandom Class
On Linux:
1) when this value is “file:/dev/urandom” then the NativePRNG algorithm is registered by the Sun crypto provider as the default implementation; the NativePRNG algorithm then reads from /dev/urandom for nextBytes but /dev/random for generateSeed
2) when this value is “file:/dev/random” then the NativePRNG algorithm is not registered by the Sun crypto provider, but the SHA1PRNG system uses a NativeSeedGenerator which reads from /dev/random.
3) when this value is anything else then the SHA1PRNG is used with a URLSeedGenerator that reads from that source.
4) when the value is undefined, then SHA1PRNG is used with ThreadedSeedGenerator
5) when the code explicitly asks for “SHA1PRNG” and the value is either “file:/dev/urandom” or “file:/dev/random” then (2) also occurs
6) when the code explicitly asks for “SHA1PRNG” and the value is some other “file:” url, then (3) occurs
7) when the code explicitly asks for “SHA1PRNG” and the value is undefined then (4) occurs
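To see which of the cases above applies on a particular machine, one can print the configured seed source together with the default algorithm. The sketch below is only an approximation: the lookup order in effectiveSeedSource() (the java.security.egd system property first, then the securerandom.source Security property) is my assumption about how the JDK resolves the value, and the class and method names are hypothetical.

```java
import java.security.SecureRandom;
import java.security.Security;

class SeedSourceDemo {

    // Assumed lookup order: -Djava.security.egd wins if set, otherwise the
    // securerandom.source entry from the java.security file is used.
    static String effectiveSeedSource() {
        String egd = System.getProperty("java.security.egd", "");
        if (!egd.isEmpty()) {
            return egd;
        }
        String source = Security.getProperty("securerandom.source");
        return source == null ? "" : source;
    }

    public static void main(String[] args) {
        System.out.println("seed source       = " + effectiveSeedSource());
        // The algorithm chosen when no name is requested explicitly.
        System.out.println("default algorithm = " + new SecureRandom().getAlgorithm());
    }
}
```

Running it with and without -Djava.security.egd=file:/dev/./urandom shows how the reported values change on that JVM.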
As for why, with the SHA1PRNG algorithm, urandom cannot be configured directly as file:/dev/urandom but needs a workaround such as file:///dev/urandom or file:/dev/./urandom, see this explanation:
In SHA1PRNG, there is a SeedGenerator which does various things depending on the configuration.
If java.security.egd or securerandom.source point to “file:/dev/random” or “file:/dev/urandom”, we will use NativeSeedGenerator, which calls super() which calls SeedGenerator.URLSeedGenerator(/dev/random). (A nested class within SeedGenerator.) The only things that changed in this bug was that urandom will also trigger use of this code path.
If those properties point to another URL that exists, we’ll initialize SeedGenerator.URLSeedGenerator(url). This is why “file:///dev/urandom”, “file:/./dev/random”, etc. will work.
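The workaround relies on the configured value being a different string while still pointing at the same file. A small sketch of that idea, assuming (as the explanation above describes) that the special case is triggered by a literal match on the URL string; the class SeedSourceUrlDemo and its helpers are hypothetical and do not reproduce the JDK's own code.

```java
import java.net.URI;

class SeedSourceUrlDemo {

    // Mimics a literal string comparison against the special-cased value.
    static boolean isLiteralDevUrandom(String source) {
        return "file:/dev/urandom".equals(source);
    }

    // Removes "." segments so that equivalent spellings can be compared.
    static String normalized(String source) {
        return URI.create(source).normalize().toString();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "file:/dev/urandom", "file:/dev/./urandom", "file:///dev/urandom"
        };
        for (String candidate : candidates) {
            System.out.println(candidate
                    + " literal match: " + isLiteralDevUrandom(candidate)
                    + ", normalized: " + normalized(candidate));
        }
    }
}
```

Only the first spelling matches literally, which is why the alternative spellings end up on the generic URL code path.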
http://hongjiang.info/java8-nativeprng-blocking/