ICPC Japan IISF 2018 Floating-Point Numbers(位运算+阅读理解)
题目描述
In this problem, we consider floating-point number formats, data representation formats to approximate real numbers on computers.
Scientific notation is a method to express a number, frequently used for numbers too large or too small to be written tersely in usual decimal form. In scientific notation, all numbers are written in the form m × 10e. Here, m (called significand) is a number greater than or equal to 1 and less than 10, and e (called exponent) is an integer. For example, a number 13.5 is equal to 1.35×101, so we can express it in scientific notation with significand 1.35 and exponent 1.
As binary number representation is convenient on computers, let's consider binary scientific notation with base two, instead of ten. In binary scientific notation, all numbers are written in the form m × 2e. Since the base is two, m is limited to be less than 2. For example, 13.5 is equal to 1.6875×23, so we can express it in binary scientific notation with significand 1.6875 and exponent 3. The significand 1.6875 is equal to 1 + 1/2 + 1/8 + 1/16, which is 1.10112 in binary notation. Similarly, the exponent 3 can be expressed as 112 in binary notation.
A floating-point number expresses a number in binary scientific notation in finite number of bits. Although the accuracy of the significand and the range of the exponent are limited by the number of bits, we can express numbers in a wide range with reasonably high accuracy.
In this problem, we consider a 64-bit floating-point number format, simplified from one actually used widely, in which only those numbers greater than or equal to 1 can be expressed. Here, the first 12 bits are used for the exponent and the remaining 52 bits for the significand. Let's denote the 64 bits of a floating-point number by b64...b1. With e an unsigned binary integer (b64...b53)2, and with m a binary fraction represented by the remaining 52 bits plus one (1.b52...b1)2, the floating-point number represents the number m × 2e.
We show below the bit string of the representation of 13.5 in the format described above.
In floating-point addition operations, the results have to be approximated by numbers representable in floating-point format. Here, we assume that the approximation is by truncation. When the sum of two floating-point numbers a and b is expressed in binary scientific notation as a + b = m × 2e (1 ≤ m < 2, 0 ≤ e < 212), the result of addition operation on them will be a floating-point number with its first 12 bits representing e as an unsigned integer and the remaining 52 bits representing the first 52 bits of the binary fraction of m.
A disadvantage of this approximation method is that the approximation error accumulates easily. To verify this, let's make an experiment of adding a floating-point number many times, as in the pseudocode shown below. Here, s and a are floating-point numbers, and the results of individual addition are approximated as described above.
s := a
for n times {
s := s + a
}
For a given floating-point number a and a number of repetitions n, compute the bits of the floating-point number s when the above pseudocode finishes.
Scientific notation is a method to express a number, frequently used for numbers too large or too small to be written tersely in usual decimal form. In scientific notation, all numbers are written in the form m × 10e. Here, m (called significand) is a number greater than or equal to 1 and less than 10, and e (called exponent) is an integer. For example, a number 13.5 is equal to 1.35×101, so we can express it in scientific notation with significand 1.35 and exponent 1.
As binary number representation is convenient on computers, let's consider binary scientific notation with base two, instead of ten. In binary scientific notation, all numbers are written in the form m × 2e. Since the base is two, m is limited to be less than 2. For example, 13.5 is equal to 1.6875×23, so we can express it in binary scientific notation with significand 1.6875 and exponent 3. The significand 1.6875 is equal to 1 + 1/2 + 1/8 + 1/16, which is 1.10112 in binary notation. Similarly, the exponent 3 can be expressed as 112 in binary notation.
A floating-point number expresses a number in binary scientific notation in finite number of bits. Although the accuracy of the significand and the range of the exponent are limited by the number of bits, we can express numbers in a wide range with reasonably high accuracy.
In this problem, we consider a 64-bit floating-point number format, simplified from one actually used widely, in which only those numbers greater than or equal to 1 can be expressed. Here, the first 12 bits are used for the exponent and the remaining 52 bits for the significand. Let's denote the 64 bits of a floating-point number by b64...b1. With e an unsigned binary integer (b64...b53)2, and with m a binary fraction represented by the remaining 52 bits plus one (1.b52...b1)2, the floating-point number represents the number m × 2e.
We show below the bit string of the representation of 13.5 in the format described above.
In floating-point addition operations, the results have to be approximated by numbers representable in floating-point format. Here, we assume that the approximation is by truncation. When the sum of two floating-point numbers a and b is expressed in binary scientific notation as a + b = m × 2e (1 ≤ m < 2, 0 ≤ e < 212), the result of addition operation on them will be a floating-point number with its first 12 bits representing e as an unsigned integer and the remaining 52 bits representing the first 52 bits of the binary fraction of m.
A disadvantage of this approximation method is that the approximation error accumulates easily. To verify this, let's make an experiment of adding a floating-point number many times, as in the pseudocode shown below. Here, s and a are floating-point numbers, and the results of individual addition are approximated as described above.
s := a
for n times {
s := s + a
}
For a given floating-point number a and a number of repetitions n, compute the bits of the floating-point number s when the above pseudocode finishes.
输入
The input consists of at most 1000 datasets, each in the following format.
n
b52...b1
n is the number of repetitions. (1 ≤ n ≤ 1018) For each i, bi is either 0 or 1. As for the floating-point number a in the pseudocode, the exponent is 0 and the significand is b52...b1.
The end of the input is indicated by a line containing a zero.
n
b52...b1
n is the number of repetitions. (1 ≤ n ≤ 1018) For each i, bi is either 0 or 1. As for the floating-point number a in the pseudocode, the exponent is 0 and the significand is b52...b1.
The end of the input is indicated by a line containing a zero.
输出
For each dataset, the 64 bits of the floating-point number s after finishing the pseudocode should be output as a sequence of 64 digits, each being 0 or 1 in one line.
样例输入
1
0000000000000000000000000000000000000000000000000000
2
0000000000000000000000000000000000000000000000000000
3
0000000000000000000000000000000000000000000000000000
4
0000000000000000000000000000000000000000000000000000
7
1101000000000000000000000000000000000000000000000000
100
1100011010100001100111100101000111001001111100101011
123456789
1010101010101010101010101010101010101010101010101010
1000000000000000000
1111111111111111111111111111111111111111111111111111
0
样例输出
0000000000010000000000000000000000000000000000000000000000000000
0000000000011000000000000000000000000000000000000000000000000000
0000000000100000000000000000000000000000000000000000000000000000
0000000000100100000000000000000000000000000000000000000000000000
0000000000111101000000000000000000000000000000000000000000000000
0000000001110110011010111011100001101110110010001001010101111111
0000000110111000100001110101011001000111100001010011110101011000
0000001101010000000000000000000000000000000000000000000000000000
题意:一个浮点数可表示为m*2^e,用一个64位二进制数记录这个数,前12位表示无符号整数e,后52位表示无符号整数m。
输入数据,n表示需要运算的次数,52位二进制数表示该浮点数的小数部分,整数部分均为1。
输出要求:以64位二进制输出运算结果。
在运算的过程中会发生数据溢出,根据浮点数的运算规则,舍弃低位保留高位。
以十进制为例:限制仅能使用两位数,对13进行连续相加,依次得到
13 26 39 ..... 91
当运算到91的时候,问题出现了,再加一次,会得到104,超出了两位的限制,此时要舍弃低位,将104表示为10*10^1。同时也要对13进行处理,因为在10*10^1中10的0虽在个位,却乘上了10^1,表示的是十位的值,所以13要舍弃低位,变为1。
继续相加则为 11*10^1 12*10^1.....
题中二进制也是这样运算。
因为只有在超出52位表示限制时才会产生舍弃操作,所以我们可以每次都直接加到溢出,然后舍弃低位。这样解决一个数据的时间复杂度为O(logN).
#include "bits/stdc++.h" using namespace std; const int maxn = 1e3 + 10; const int mod = 1e9 + 7; const int inf = 0x3f3f3f3f; #define ll long long #define ull unsigned long long string s; int main() { //freopen("input.txt", "r", stdin); //freopen("output.txt","w",stdout); ll n; while (~scanf("%lld", &n)) { if (n == 0) break; cin >> s; s = "1" + s; ll nn = 0; for (int i = 52; i >= 0; i--) {//将字符串转换为对应的整型数据 if (s[i] == '1') { nn += (1ll << (52 - i)); } } ll e = 0; ll m = nn; int cnt = 0; while (n) { // cout << ++cnt << endl; ll Time = ((1ll << 53) - m) / nn; if (((1ll << 53) - m) % nn == 0) Time--;//Time+1表示再加多少次会爆掉53位,产生精度损失。 // cout << Time << endl; if (Time >= n) { // cout << "no" << endl; m += nn * n; break; } else { // cout << "yes" << endl; m += nn * (Time + 1); e++; m >>= 1; nn >>= 1; n -= (Time+1); if (nn == 0) break; } } for (int i = 11; i >= 0; i--) { if (e & (1ll << i)) cout << 1; else cout << 0; } for (int i = 51; i >= 0; i--) { if (m & (1ll << i)) cout << 1; else cout << 0; } cout<<endl; } return 0; }