1、下载scala sdk
http://www.scala-lang.org/download/ 直接到这里下载sdk,(https://downloads.lightbend.com/scala/2.12.8/scala-2.12.8.msi)
2、下载scala for intellij idea的插件
File->setting->plugins里搜索Scala,然后安装即可
3、https://maven.apache.org/download.cgi
http://mirrors.shu.edu.cn/apache/maven/maven-3/3.6.0/binaries/apache-maven-3.6.0-bin.zip
4、生成工程
mvn archetype:generate -DarchetypeGroupId=org.apache.flink -DarchetypeArtifactId=flink-quickstart-scala
或者
mvn archetype:generate -DarchetypeGroupId=org.apache.flink -DarchetypeArtifactId=flink-quickstart-java -DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/ -DarchetypeVersion=1.7-SNAPSHOT
5、scala统计词频示例
package com.test.s import org.apache.flink.api.scala._ object WordCount { def main(args: Array[String]) { val env = ExecutionEnvironment.getExecutionEnvironment // get input data val text = env.readTextFile("D:\\git\\test\\pom.xml") val counts = text.flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } } .map { (_, 1) } .groupBy(0) .sum(1) // counts.writeAsCsv("D:\\git\\test\\output.txt", "\n", " ") counts.print() env.execute("Socket Window WordCount") } }
- 直接按照样例执行,可能出现以下错误:
Exception in thread "main" java.lang.RuntimeException: No new data sinks have been defined since the last execution. The last execution refers to the latest call to 'execute()', 'count()', 'collect()', or 'print()'.
- 参照此文,原因是print()方法自动会调用execute()方法,造成错误,所以注释掉
env.execute()
即可