Accessing Tables and Table Fields

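Once we have an SQLite database, we can access not only the data stored in its tables but also some metadata, for example the names of all tables or the columns of a particular table.
To demonstrate, connect to the SQLite database created earlier:
con <- dbConnect(SQLite(), "data/datasets.sqlite")
Call dbExistsTable() to check whether a given table exists in the database: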
dbExistsTable(con, "diamonds")
## [1] TRUE
dbExistsTable(con, "mtcars")
## [1] FALSE
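So far we have written only two tables, diamonds and flights, to the datasets.sqlite database, so the values returned by dbExistsTable() are correct. As a complement to checking whether a single table exists, dbListTables() lists all tables in the database: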
dbListTables(con)
## [1] "diamonds" "flights"
For a given table, dbListFields() lists its column names (fields):
dbListFields(con, "diamonds")
## [1] "carat" "cut" "color" "clarity" "depth"
## [6] "table" "price" "x" "y" "z"
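The two listing functions combine naturally. The following is a minimal sketch, not part of the original walkthrough, that collects the field names of every table in one named list. It assumes the DBI and RSQLite packages are installed, and it uses a hypothetical in-memory database filled with the built-in mtcars and iris data sets so that it does not depend on data/datasets.sqlite.

```r
library(DBI)
library(RSQLite)

# Hypothetical in-memory database, used only for this illustration
mem_con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(mem_con, "mtcars", mtcars, row.names = FALSE)
dbWriteTable(mem_con, "iris", iris, row.names = FALSE)

# Build a named list: one element per table, holding that table's field names
tables <- dbListTables(mem_con)
fields_by_table <- lapply(
  setNames(tables, tables),
  function(tbl) dbListFields(mem_con, tbl)
)

str(fields_by_table)
dbDisconnect(mem_con)
```

The same lapply() call should work against con from the example above, as long as the connection is still open.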
In contrast to dbWriteTable(), which writes a data frame to the database, dbReadTable() reads an entire table from the database into a data frame:
db_diamonds <- dbReadTable(con, "diamonds")
dbDisconnect(con)
## [1] TRUE
We now have two versions of the data: db_diamonds, the data frame read from the database, and diamonds, the original. We can compare the two:
head(db_diamonds, 3)
## carat cut color clarity depth table price x y z
## 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43
## 2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31
## 3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31
head(diamonds, 3)
## carat cut color clarity depth table price x y z
## 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43
## 2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31
## 3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31
The two data frames look exactly the same. However, comparing them with identical() shows that they are in fact not identical:
identical(diamonds, db_diamonds)
## [1] FALSE
To reveal the differences, we call str() to inspect the structure of each data frame.
This is the structure of the table read from the database:
str(db_diamonds)
## 'data.frame': 53940 obs. of 10 variables:
## $ carat : num 0.23 0.21 0.23 0.29 0.31 0.24 0.24...
## $ cut : chr "Ideal" "Premium" "Good" "Premium" ...
## $ color : chr "E" "E" "E" "I" ...
## $ clarity: chr "SI2" "SI1" "VS1" "VS2" ...
## $ depth : num 61.5 59.8 56.9 62.4 63.3 62.8 62.3...
## $ table : num 55 61 65 58 58 57 57 55 61 61 ...
## $ price : int 326 326 327 334 335 336 336 337 337...
## $ x : num 3.95 3.89 4.05 4.2 4.34 3.94 3.95...
## $ y : num 3.98 3.84 4.07 4.23 4.35 3.96 3.98...
## $ z : num 2.43 2.31 2.31 2.63 2.75 2.48 2.47...
This is the structure of the original version:
str(diamonds)
## Classes 'tbl_df', 'tbl' and 'data.frame': 53940 obs. of 10 variables:
## $ carat : num 0.23 0.21 0.23 0.29 0.31 0.24 0.24...
## $ cut : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
## $ color : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 2 6 7 7 6 5 2 5 ...
## $ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
## $ depth : num 61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
## $ table : num 55 61 65 58 58 57 57 55 61 61 ...
## $ price : int 326 326 327 334 335 336 336 337 337 338 ...
## $ x : num 3.95 3.89 4.05 4.2 4.34 3.94 3.95...
## $ y : num 3.98 3.84 4.07 4.23 4.35 3.96 3.98...
## $ z : num 2.43 2.31 2.31 2.63 2.75 2.48 2.47...
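As a more compact alternative to reading two str() listings side by side, the column classes can be compared programmatically. The sketch below is an illustration under stated assumptions: it uses a small hypothetical stand-in for diamonds and an in-memory database rather than the data from this article, and it assumes the DBI and RSQLite packages are installed.

```r
library(DBI)
library(RSQLite)

# Small hypothetical stand-in for diamonds with one ordered factor column
original <- data.frame(
  carat = c(0.23, 0.21, 0.23),
  cut = factor(c("Ideal", "Premium", "Good"),
               levels = c("Fair", "Good", "Very Good", "Premium", "Ideal"),
               ordered = TRUE),
  price = c(326L, 326L, 327L)
)

# Round trip through an in-memory SQLite database
mem_con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(mem_con, "original", original, row.names = FALSE)
from_db <- dbReadTable(mem_con, "original")
dbDisconnect(mem_con)

# First class of each column, before and after the round trip
class_cmp <- data.frame(
  column = names(original),
  original = vapply(original, function(col) class(col)[1], character(1)),
  from_db = vapply(from_db, function(col) class(col)[1], character(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Keep only the columns whose class changed
class_cmp[class_cmp$original != class_cmp$from_db, ]
```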
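Now the differences are clear. In the original version, cut, color, and clarity are ordered factors, which are essentially integers that carry metadata (the ordered levels). In the database version, by contrast, these columns are stored as text. The reason for this change is that SQLite has no built-in support for ordered factors. Therefore, apart from common data types (such as numeric, text, and logical), R-specific types are converted to types that SQLite supports before the data frame is inserted into the database.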
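Because the level order is not stored in the database, it has to be supplied again after reading if the ordered factor is needed. The following is one possible way to do that, shown on a small hypothetical data frame and an in-memory database (both are assumptions for illustration, as is the variable name cut_levels). It assumes the DBI and RSQLite packages are installed.

```r
library(DBI)
library(RSQLite)

# Level order for the cut column, kept in R because SQLite will not store it
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")

# Small hypothetical stand-in for diamonds
original <- data.frame(
  cut = factor(c("Ideal", "Premium", "Good"), levels = cut_levels, ordered = TRUE),
  price = c(326L, 326L, 327L)
)

# Round trip through an in-memory SQLite database; cut comes back as text
mem_con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(mem_con, "original", original, row.names = FALSE)
restored <- dbReadTable(mem_con, "original")
dbDisconnect(mem_con)

# Rebuild the ordered factor from the text column and the known level order
restored$cut <- factor(restored$cut, levels = cut_levels, ordered = TRUE)

str(restored)
```

For the real diamonds table, the same conversion would be needed for color and clarity, each with its own level vector.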

posted @ 2019-02-11 11:16 NAVYSUMMER