Ruby Threads (Part 1)
This chapter covers:
· Creating and Manipulating Threads
· Synchronizing Threads
· Summary
He draweth out the thread of his verbosity finer than the staple of his argument.
William Shakespeare, Love's Labours Lost, act 5, scene 1
Threads are sometimes called lightweight processes. They are nothing more than a way to achieve concurrency without all the overhead of switching tasks at the operating system level. (Of course, the computing community isn't in perfect agreement about the definition of threads; but we won't go into that.)
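Ruby's threads, as described here, are user-level threads implemented by the interpreter, so they do not depend on the operating system. They work on DOS as well as on Unix. They do carry a performance cost, however, and that cost varies by operating system.
Threads are useful, for example, when separate pieces of code are naturally independent of each other. They are also useful when an application spends much of its time waiting for an event: often, while one thread is waiting, another can be doing more useful processing.
On the other hand, threads have potential disadvantages. The decrease in speed has to be weighed against the benefits. Also, when access to a resource is inherently serialized, threads do not help. Finally, the overhead of synchronizing access to global resources sometimes exceeds the savings gained from multithreading.
For these and other reasons, some authorities claim that threaded programming is to be avoided. Concurrent code can indeed be complex, error-prone, and difficult to debug. But we will leave it to you to use these techniques at the appropriate times.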
The difficulties associated with unsynchronized threads are well-known. Global data can be corrupted by threads attempting simultaneous access to those data. Race conditions can occur wherein one thread makes some assumption about what another has done already; these commonly result in "non-deterministic" code that might run differently with each execution. Finally, there is the danger of deadlock, wherein no thread can continue because it is waiting for a resource held by some other thread that is also blocked. Code written to avoid these problems is referred to as thread-safe code.
Not all of Ruby is thread-safe, but synchronization methods are available that will enable you to control access to variables and resources, protect critical sections of code, and avoid deadlock. We will deal with these techniques in this chapter and give code fragments to illustrate them.
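As a first taste of what protecting a critical section can look like, here is a minimal sketch that uses the standard Mutex class. The method name locked_count and the thread and iteration counts are assumptions made up for this illustration; they do not come from the text above.

```ruby
require 'thread'   # Needed only on old interpreters; harmless on newer ones.

# Hypothetical helper: several threads increment one shared counter,
# and a Mutex makes each increment a critical section.
def locked_count(thread_count, increments)
  counter = 0
  lock = Mutex.new
  workers = Array.new(thread_count) do
    Thread.new do
      increments.times do
        # Only one thread at a time may execute this block.
        lock.synchronize { counter += 1 }
      end
    end
  end
  workers.each { |w| w.join }   # Wait for every worker to finish.
  counter
end

puts locked_count(4, 1000)      # 4000
```

Because every increment happens while the lock is held, no update can be lost, whatever order the scheduler picks.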
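I. Creating and Manipulating Threads
The most basic operations on threads include creating a thread, passing information in and out, stopping a thread, and so on. We can also obtain a list of threads, check the state of a thread, and check various other information. We give an overview of these basic operations here.
Creating a thread is easy. We call the new method and attach a block that will be the body of the thread.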
thread = Thread.new do
  # Statements comprising
  # the thread...
end
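The value returned is obviously an object of type Thread, which the main thread uses to control the thread it has created.
What if we want to pass parameters into a thread? We can do this by passing them to Thread.new, which in turn passes them into the block.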
a = 4
b = 5
c = 6
thread2 = Thread.new(a, b, c) do |a, x, y|
  # Manipulate a, x, and y as needed.
end
# Note that if a is changed in the new thread, it will
# change suddenly and without warning in the main
# thread.
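As with any other block parameters, any of these that correspond to existing variables are effectively assigned to those variables. The variable a in the preceding fragment is dangerous in this way, as the comment points out. (This is the behavior of Ruby 1.8; from Ruby 1.9 on, block parameters are always local to the block.)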
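One practical reason to pass values through Thread.new is that each thread then keeps the value it was started with, even if the outer variable changes afterward. The following is a small sketch of that idea; the method name squares_via_arguments is hypothetical, and it uses the value method, which waits for a thread and returns its block's result.

```ruby
# Hypothetical helper: start one thread per number, handing the number
# over as an argument instead of reading the loop variable directly.
def squares_via_arguments
  threads = []
  for n in 1..3
    # n is copied into the block parameter `number`, so later changes
    # to n cannot affect a thread that has already been started.
    threads << Thread.new(n) { |number| number * number }
  end
  threads.map { |t| t.value }   # value joins the thread, then returns its result
end

p squares_via_arguments         # [1, 4, 9]
```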
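Threads can also access variables from the scope in which they were created. Obviously, without synchronization, this can be a problem. The main thread and one or more other threads might modify such a variable independently of one another, and the result may be unpredictable.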
x = 1
y = 2
thread3 = Thread.new do
  # This thread can manipulate x and y from the outer scope,
  # but this is not always safe.
  sleep(rand(0))  # Sleep a random fraction of a second.
  x = 3
end
sleep(rand(0))
puts x
# Running this code repeatedly, x may be 1 or 3 when it
# is printed here!
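The method fork is an alias for start, which behaves much like new; the name comes from the well-known Unix system call of the same name.
2. Accessing Thread-local Variables
We know that it can be dangerous for a thread to use variables from outside its scope; we also know that a thread can have local data of its own. But what if a thread wants to make some of the data it owns public?
There is a special mechanism for this purpose. If a thread object is treated as a hash, thread-local data can be accessed from anywhere within the scope of that thread object. We don't mean that actual local variables can be accessed in this way, but only that we have access to named data on a per-thread basis.
There is also a method called key? that tells us whether a given name is in use for this thread.
Within the thread, we must refer to the data in this hash-like way. Using Thread.current makes this a little easier to do.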
thread = Thread.new do
  t = Thread.current
  t[:var1] = "This is a string"
  t[:var2] = 365
end
thread.join   # Wait until the thread has stored its data
# Access the thread-local data from outside...
x = thread[:var1] # "This is a string"
y = thread[:var2] # 365
has_var2 = thread.key?("var2") # true
has_var3 = thread.key?("var3") # false
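Note that these data are accessible from other threads even after the thread that owned them is dead (as in this example).
Besides a symbol (as you just saw), we can also use a string to identify the thread-local variable.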
thread = Thread.new do
  t = Thread.current
  t["var3"] = 25
  t[:var4] = "foobar"
end
thread.join             # Wait until the thread has stored its data
a = thread[:var3]       # 25
b = thread["var4"]      # "foobar"
Don't confuse these special names with actual local variables.
thread = Thread.new do
  t = Thread.current
  t["var3"] = 25
  t[:var4] = "foobar"
  var3 = 99            # True local variables (not
  var4 = "zorch"       # accessible from outside)
end
thread.join            # Wait until the thread has stored its data
a = thread[:var3]      # 25
b = thread["var4"]     # "foobar"
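Finally, note that an object reference held in a true local variable can be used as a sort of shorthand within the thread. This works as long as you are careful to keep the same object reference rather than create a new one.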
thread = Thread.new do
  t = Thread.current
  x = "nXxeQPdMdxiBAxh"
  t[:my_message] = x
  x.reverse!
  x.delete! "x"
  x.gsub!(/[A-Z]/,"")
  # On the other hand, assignment would create a new
  # object and make this shorthand useless...
end
thread.join                # Wait until the thread has finished
a = thread[:my_message]    # "hidden"
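Also, this shortcut obviously will not work when you are dealing with values such as Fixnums, which are stored as immediate values rather than as object references.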
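The difference between changing an object in place and assigning a new one can be seen side by side in the following sketch. The method name shorthand_demo and the keys :kept and :lost are invented for this illustration.

```ruby
# Hypothetical helper contrasting in-place mutation with reassignment.
def shorthand_demo
  thread = Thread.new do
    t = Thread.current

    kept = "abc".dup        # dup gives a mutable copy of the literal
    t[:kept] = kept
    kept << "def"           # In-place change: also seen through t[:kept]

    lost = "abc"
    t[:lost] = lost
    lost = lost + "def"     # Assignment: lost now names a NEW string,
    lost.length             # so t[:lost] still refers to the old one
  end
  thread.join               # Wait until the thread has stored its data
  [thread[:kept], thread[:lost]]
end

p shorthand_demo            # ["abcdef", "abc"]
```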
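3. Querying and Changing Thread Status
The Thread class has several class methods that serve various purposes. The list method returns an array of all living threads, the main method returns a reference to the main thread (the one that spawns the others), and the current method lets a thread find out its own identity.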
t1 = Thread.new { sleep 100 }
t2 = Thread.new do
  if Thread.current == Thread.main
    puts "This is the main thread."   # Does NOT print
  end
  1.upto(1000) do
    sleep 0.1
  end
end
count = Thread.list.size              # 3
if Thread.list.include?(Thread.main)
  puts "Main thread is alive."        # Always prints!
end
if Thread.current == Thread.main
  puts "I'm the main thread."         # Prints here...
end
The exit, pass, start, stop, and kill methods are used to control the execution of threads, often from inside or outside.
# In the main thread...
Thread.kill(t1)           # Kill this thread now
Thread.pass               # Yield to the scheduler (pass takes no argument)
t3 = Thread.new do
  sleep 20
  Thread.exit             # Exit the thread
  puts "Can't happen!"    # Never reached
end
Thread.kill(t2)           # Now kill t2
# Now exit the main thread (killing any others)
Thread.exit
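Note that there is no instance method stop, so a thread can stop itself but cannot stop another thread.
There are also various methods for checking the state of a thread. The instance method alive? tells whether the thread is still living (has not exited), and stop? tells whether the thread is in a stopped state.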
count = 0
t1 = Thread.new { loop { count += 1 } }
t2 = Thread.new { Thread.stop }
sleep 1
flags = [t1.alive?,   # true
         t1.stop?,    # false
         t2.alive?,   # true
         t2.stop?]    # true
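The status of a thread can be seen with the status method. The value returned is "run" if the thread is currently running; "sleep" if it is stopped, sleeping, or waiting on I/O; false if it terminated normally; and nil if it terminated with an exception.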
t1 = Thread.new { loop {} }
t2 = Thread.new { sleep 5 }
t3 = Thread.new { Thread.stop }
t4 = Thread.new { Thread.exit }
t5 = Thread.new { raise "exception" }
s1 = t1.status # "run"
s2 = t2.status # "sleep"
s3 = t3.status # "sleep"
s4 = t4.status # false
s5 = t5.status # nil
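Because a freshly created thread may not yet have reached the state we want to observe, a more careful check waits for each thread first. The sketch below does that; the method name observed_statuses is hypothetical, and the report_on_exception setting is touched only on interpreters that provide it.

```ruby
# Hypothetical helper: observe three of the possible status values
# after making sure each thread has actually reached that state.
def observed_statuses
  sleeper  = Thread.new { sleep }       # Sleeps until woken or killed
  finished = Thread.new { :done }       # Terminates normally
  failed   = Thread.new do
    # Silence the automatic error report where this setting exists.
    me = Thread.current
    me.report_on_exception = false if me.respond_to?(:report_on_exception=)
    raise "boom"
  end

  sleep 0.01 until sleeper.status == "sleep"   # Wait until it is asleep
  finished.join
  begin
    failed.join          # join re-raises the exception that ended the thread
  rescue RuntimeError
    # Expected: the thread died with "boom".
  end

  result = [sleeper.status, finished.status, failed.status]
  sleeper.kill           # Clean up the sleeping thread
  result
end

p observed_statuses      # ["sleep", false, nil]
```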
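The global variable $SAFE may be set differently in different threads. In this sense, it is not truly a global variable at all; but we should not complain, because this allows us to run threads at different safety levels. The safe_level method tells us the level of a running thread. (This describes older interpreters; safe levels were removed in Ruby 3.0.)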
t1 = Thread.new { $SAFE = 1; sleep 5 }
t2 = Thread.new { $SAFE = 3; sleep 5 }
sleep 1
lev0 = Thread.main.safe_level # 0
lev1 = t1.safe_level # 1
lev2 = t2.safe_level # 3
The priority of a thread may be examined and changed using the priority accessor.
t1 = Thread.new { loop { sleep 1 } }
t2 = Thread.new { loop { sleep 1 } }
t2.priority = 3 # Set t2 at priority 3
p1 = t1.priority # 0
p2 = t2.priority # 3
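The special method pass is used when a thread wants to yield control to the scheduler. The thread merely yields its timeslice; it doesn't really stop or go to sleep.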
t1 = Thread.new do
  puts "alpha"
  Thread.pass
  puts "beta"
end
t2 = Thread.new do
  puts "gamma"
  puts "delta"
end
t1.join
t2.join
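In this contrived example, we get output in the order alpha gamma delta beta when Thread.pass is called as shown. Without it, we get alpha beta gamma delta. The exact order depends on the scheduler, so this feature should not be used for synchronization, but only for the thrifty allocation of timeslices.
A thread that is stopped may be awakened with the run or wakeup method: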
t1 = Thread.new do
  Thread.stop
  puts "There is an emerald here."
end
t2 = Thread.new do
  Thread.stop
  puts "You're at Y2."
end
sleep 1
t1.wakeup
t2.run
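The difference between them is subtle. The wakeup call changes the state of the thread so that it is runnable, but does not schedule it to run; run, on the other hand, wakes up the thread and schedules it for immediate running.
In this particular case, the result is that t1 wakes up before t2, but t2 gets scheduled first, producing the following output: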
You're at Y2.
There is an emerald here.
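The fixed sleep 1 above is only a guess that both threads have stopped by then. A slightly more defensive sketch polls the thread's status before waking it; the method name wake_with_run and the log array are assumptions made for this illustration.

```ruby
# Hypothetical helper: stop a thread, confirm it is stopped, then wake it.
def wake_with_run
  log = []
  worker = Thread.new do
    log << "stopping"
    Thread.stop             # Suspend until another thread wakes us
    log << "resumed"
  end

  sleep 0.01 until worker.status == "sleep"   # Make sure it has stopped
  log << "waking"
  worker.run                # Wake the thread and let the scheduler run it
  worker.join               # Wait for it to finish
  log
end

p wake_with_run             # ["stopping", "waking", "resumed"]
```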
The raise instance method raises an exception in the thread specified as the receiver. (The call does not have to originate within that thread.)
factorial1000 = Thread.new do
  begin
    prod = 1
    1.upto(1000) { |n| prod *= n }
    puts "1000! = #{prod}"
  rescue
    # Do nothing...
  end
end
sleep 0.01 # Your mileage may vary.
if factorial1000.alive?
  factorial1000.raise("Stop!")
  puts "Calculation was interrupted!"
else
  puts "Calculation was successful."
end
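The thread spawned above tries to calculate the factorial of 1000; if it has not finished within a hundredth of a second, the main thread raises an exception in it. So, on a relatively slow machine, the message printed by this fragment will be Calculation was interrupted!. As for the rescue clause inside the thread, we could obviously put any code there that we wanted, as with any other such clause.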
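Since the outcome above depends on machine speed, a deterministic variant can help when experimenting. In this sketch the worker simply sleeps in place of a long computation, and the exception class StopRequested and the method name interrupt_worker are hypothetical names chosen for the illustration.

```ruby
# Hypothetical exception class used to request that a worker stop.
class StopRequested < StandardError; end

# Hypothetical helper: raise an exception inside a waiting thread
# and return whatever the thread's rescue clause produced.
def interrupt_worker
  worker = Thread.new do
    begin
      sleep                          # Stands in for a long computation
      :finished
    rescue StopRequested => e
      "interrupted: #{e.message}"    # Becomes the thread's value
    end
  end

  # Wait until the worker is asleep, and therefore inside the begin block.
  sleep 0.01 until worker.status == "sleep"
  worker.raise(StopRequested, "Stop!")
  worker.value                       # Waits for the thread, returns its result
end

puts interrupt_worker                # interrupted: Stop!
```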
4. Achieving a Rendezvous (and Capturing a Return Value)
Sometimes the main thread wants to wait for another thread to finish. The instance method join accomplishes this.
t1 = Thread.new { do_something_long() }
do_something_brief()
t1.join # Wait for t1
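Note that a join is necessary if the main thread is to wait on another thread. Otherwise, the thread is killed when the main thread exits. For example, this fragment would never give us its final answer without the join at the end: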
meaning_of_life = Thread.new do
  puts "The answer is..."
  sleep 10
  puts 42
end
sleep 9
meaning_of_life.join
Here is a useful little idiom. It calls join on every living thread except the main one. (It is an error for any thread, even the main thread, to call join on itself.)
Thread.list.each { |t| t.join if t != Thread.main }
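The idiom joins every living thread, including any that a library may have started. When that is not wanted, one alternative is to keep a list of the threads you created and join only those. A minimal sketch, assuming a hypothetical helper join_tracked and an invented :label key:

```ruby
# Hypothetical helper: start one thread per name, wait for exactly
# those threads, then read back what each one stored.
def join_tracked(names)
  threads = names.map do |name|
    Thread.new(name) do |label|
      Thread.current[:label] = label.upcase   # Thread-local result
    end
  end
  threads.each { |t| t.join }                 # Join only our own threads
  threads.map { |t| t[:label] }
end

p join_tracked(%w[a b c])                     # ["A", "B", "C"]
```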
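It is, of course, possible for one thread to do a join on another when neither is the main thread. If the main thread and another thread attempt to join each other, a deadlock results; the interpreter detects this situation and exits the program.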
thr = Thread.new { sleep 1; Thread.main.join }
thr.join # Deadlock results!
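A thread has an associated block, and a block can return a value. This implies that a thread can return a value. The value method implicitly does a join and waits for the thread to complete; then it returns the value of the last expression evaluated in the thread.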
max = 10000
thr = Thread.new do
  sum = 0
  1.upto(max) { |i| sum += i }
  sum
end
guess = (max*(max+1))/2
print "Formula is "
if guess == thr.value
  puts "right."
else
  puts "wrong."
end
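The same idea extends to several threads: each one returns a partial result, and value collects them. The following sketch splits the sum into slices; the method name parallel_sum and the slice count are assumptions made for the illustration, and no speedup is claimed.

```ruby
# Hypothetical helper: sum 1..max in several threads and combine
# the partial sums returned by each thread's block.
def parallel_sum(max, slices)
  size = (max.to_f / slices).ceil
  threads = (1..max).each_slice(size).map do |chunk|
    Thread.new(chunk) do |numbers|
      numbers.inject(0) { |sum, n| sum + n }   # This thread's return value
    end
  end
  # value joins each thread and returns the result of its block.
  threads.map { |t| t.value }.inject(0) { |total, part| total + part }
end

max = 10_000
puts parallel_sum(max, 4) == (max * (max + 1)) / 2   # true
```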
5. Dealing with Exceptions
What happens if an exception occurs within a thread? As it turns out, the behavior is configurable.