陈大同


Part V

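Build an inverted index for the documents: word -> document.
This is similar to the word count done earlier, except that each word is now mapped to the documents it appears in.
mapF parses the document the same way as before; only the information it returns differs.
reduceF returns the reduced string: it collects the documents into a slice and converts that slice to a string with strings.Join.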
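The map/group/reduce flow described above can be tried without the lab's mapreduce package. The following is a minimal sketch, not the lab framework: the `KeyValue` type, `mapWords`, `reduceDocs`, and `buildIndex` are hypothetical stand-ins introduced only for illustration, and `buildIndex` plays the role that the framework normally plays when it groups the intermediate pairs by key.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// KeyValue is a local stand-in for mapreduce.KeyValue (an assumption, so the
// sketch runs without the lab package).
type KeyValue struct {
	Key   string
	Value string
}

// mapWords emits one (word, document) pair per word occurrence.
func mapWords(document, contents string) []KeyValue {
	words := strings.FieldsFunc(contents, func(c rune) bool {
		return !unicode.IsLetter(c)
	})
	kvs := make([]KeyValue, 0, len(words))
	for _, w := range words {
		kvs = append(kvs, KeyValue{Key: w, Value: document})
	}
	return kvs
}

// reduceDocs removes duplicate documents and returns "<count> <doc1>,<doc2>,...".
func reduceDocs(key string, docs []string) string {
	seen := make(map[string]bool)
	var unique []string
	for _, d := range docs {
		if !seen[d] {
			seen[d] = true
			unique = append(unique, d)
		}
	}
	return fmt.Sprintf("%d %s", len(unique), strings.Join(unique, ","))
}

// buildIndex is a hypothetical driver: it maps every file (in sorted name
// order), groups the pairs by word, and reduces each group.
func buildIndex(files map[string]string) map[string]string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	grouped := make(map[string][]string)
	for _, name := range names {
		for _, kv := range mapWords(name, files[name]) {
			grouped[kv.Key] = append(grouped[kv.Key], kv.Value)
		}
	}

	index := make(map[string]string, len(grouped))
	for word, docs := range grouped {
		index[word] = reduceDocs(word, docs)
	}
	return index
}

func main() {
	index := buildIndex(map[string]string{
		"a.txt": "the cat sat. The end",
		"b.txt": "the dog sat",
	})
	fmt.Println("the:", index["the"])
	fmt.Println("cat:", index["cat"])
}
```

Note that the split is case-sensitive, so "the" and "The" end up as separate keys, just as they would with the mapF in this post.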
 
master@master:~/study/6.824/src/main$ go run ii.go master sequential pg-*.txt
ii.go:9:8: cannot find package "mapreduce" in any of:
/usr/local/go/src/mapreduce (from $GOROOT)
/home/master/go/src/mapreduce (from $GOPATH)

The mapreduce package cannot be found, so add the project directory to GOPATH:
$ cd 6.824
$ export "GOPATH=$PWD" # go needs $GOPATH to be set to the project's working directory
 

Run bash ./test-ii.sh to check the program's output directly.

 

ii.go

package main

import (
    "fmt"
    "mapreduce"
    "os"
    "strconv"
    "strings"
    "unicode"
)

// The mapping function is called once for each piece of the input.
// In this framework, the key is the name of the file that is being processed,
// and the value is the file's contents. The return value should be a slice of
// key/value pairs, each represented by a mapreduce.KeyValue.
func mapF(document string, value string) (res []mapreduce.KeyValue) {
    // Your code here (Part V).
    f := func(c rune) bool {
        return !unicode.IsLetter(c)
    }
    rst := make([]mapreduce.KeyValue, 0)

    keys := strings.FieldsFunc(value, f)
    for _, key := range keys {
        kv := mapreduce.KeyValue{Key: key, Value: document}
        rst = append(rst, kv)
    }
    return rst
}

// The reduce function is called once for each key generated by Map, with a
// list of that key's string value (merged across all inputs). The return value
// should be a single output value for that key.
func reduceF(key string, values []string) string {
    // Your code here (Part V).
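    // seen records the documents already added, so each one is listed once.
    seen := make(map[string]bool)
    var rst []string
    for _, value := range values {
        if !seen[value] {
            rst = append(rst, value)
            seen[value] = true
        }
    }

    // Output format: "<number of documents> <doc1>,<doc2>,...".
    vl := strings.Join(rst, ",")
    return strconv.Itoa(len(rst)) + " " + vl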
}

// Can be run in 3 ways:
// 1) Sequential (e.g., go run ii.go master sequential x1.txt .. xN.txt)
// 2) Master (e.g., go run ii.go master localhost:7777 x1.txt .. xN.txt)
// 3) Worker (e.g., go run ii.go worker localhost:7777 localhost:7778 &)
func main() {
    if len(os.Args) < 4 {
        fmt.Printf("%s: see usage comments in file\n", os.Args[0])
    } else if os.Args[1] == "master" {
        var mr *mapreduce.Master
        if os.Args[2] == "sequential" {
            mr = mapreduce.Sequential("iiseq", os.Args[3:], 3, mapF, reduceF)
        } else {
            mr = mapreduce.Distributed("iiseq", os.Args[3:], 3, os.Args[2])
        }
        mr.Wait()
    } else {
        mapreduce.RunWorker(os.Args[2], os.Args[3], mapF, reduceF, 100, nil)
    }
}
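The reduceF above lists documents in the order their values arrive. As a hedged variation (an assumption on my part, not something the post or the lab requires), the documents can be sorted before joining, so the output no longer depends on the order in which values are delivered. `reduceSorted` is a hypothetical name used only for this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// reduceSorted deduplicates the document names, sorts them, and returns
// "<count> <doc1>,<doc2>,...". Sorting is an added assumption; the reduceF
// in this post keeps arrival order instead.
func reduceSorted(key string, values []string) string {
	seen := make(map[string]struct{}, len(values))
	docs := make([]string, 0, len(values))
	for _, v := range values {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		docs = append(docs, v)
	}
	sort.Strings(docs)
	return fmt.Sprintf("%d %s", len(docs), strings.Join(docs, ","))
}

func main() {
	fmt.Println(reduceSorted("word", []string{"pg-b.txt", "pg-a.txt", "pg-b.txt"}))
	// prints "2 pg-a.txt,pg-b.txt"
}
```

The trade-off is an extra O(n log n) sort per key, which is negligible for a handful of input files.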

Above: all five parts of Lab 1 tested together.

Lab 1 is complete.

 
 
posted on 2019-03-24 22:27  陈大同