.NET for Apache Spark: installing .NET Core on CentOS Linux

sudo rpm -Uvh https://packages.microsoft.com/config/rhel/7/packages-microsoft-prod.rpm

sudo yum update
sudo yum install dotnet-sdk-2.2

dotnet
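Running dotnet with no arguments only prints usage text. As a slightly stricter check, the sketch below tests whether the CLI is on PATH and prints the SDK version if it is. The variable name dotnet_status is only for illustration.

```shell
# Report whether the dotnet CLI is on PATH, and print the SDK version if so.
if command -v dotnet >/dev/null 2>&1; then
    dotnet_status="found"
    dotnet --version
else
    dotnet_status="missing"
    echo "dotnet not found on PATH; re-check the yum install step" >&2
fi
echo "dotnet: $dotnet_status"
```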

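Extract the .NET for Apache Spark worker package, then point the DOTNET_WORKER_DIR environment variable at the extracted directory. Use $HOME rather than ~ here, because the shell does not expand a tilde inside double quotes:
export DOTNET_WORKER_DIR="$HOME/bin/Microsoft.Spark.Worker-0.4.0"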
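One detail worth checking when setting this variable is quoting. The sketch below (variable names quoted and expanded are hypothetical) shows that a tilde inside double quotes stays a literal character, while $HOME is expanded to a real path.

```shell
# Inside double quotes the shell keeps the tilde as a literal character.
quoted="~/bin/Microsoft.Spark.Worker-0.4.0"

# $HOME is expanded inside double quotes, so this holds a real path.
expanded="$HOME/bin/Microsoft.Spark.Worker-0.4.0"

echo "quoted:   $quoted"
echo "expanded: $expanded"
```

If echo "$DOTNET_WORKER_DIR" prints a value that begins with ~, the worker directory will most likely not be found at run time.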


dotnet new console -o mySparkApp
cd mySparkApp
dotnet add package Microsoft.Spark --version 0.4.0

Create a file named input.txt with the following content:
Hello World
This .NET app uses .NET for Apache Spark
This .NET app counts words with Apache Spark
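If you prefer to create the file from the terminal, a here-document does the job. This is just one way to do it and assumes you are in the mySparkApp directory.

```shell
# Write the three sample lines to input.txt in the current directory.
cat > input.txt <<'EOF'
Hello World
This .NET app uses .NET for Apache Spark
This .NET app counts words with Apache Spark
EOF

# Confirm the number of lines written.
wc -l < input.txt
```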


Replace the contents of Program.cs with:

using Microsoft.Spark.Sql;

namespace MySparkApp
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a Spark session
            var spark = SparkSession
                .Builder()
                .AppName("word_count_sample")
                .GetOrCreate();

            // Create initial DataFrame
            DataFrame dataFrame = spark.Read().Text("input.txt");

            // Count words
            var words = dataFrame
                .Select(Functions.Split(Functions.Col("value"), " ").Alias("words"))
                .Select(Functions.Explode(Functions.Col("words"))
                    .Alias("word"))
                .GroupBy("word")
                .Count()
                .OrderBy(Functions.Col("count").Desc());

            // Show results
            words.Show();
        }
    }
}
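To sanity-check what the Spark job should report, the same split, group, count and sort steps can be approximated with standard shell tools. This is only a rough cross-check, not how Spark computes the result, and the function name word_counts is hypothetical.

```shell
# Split on spaces, group identical words, count them, sort by count descending.
word_counts() {
    tr ' ' '\n' | sort | uniq -c | sort -rn
}

# Feed in the same three lines that input.txt contains.
counts=$(printf '%s\n' \
    'Hello World' \
    'This .NET app uses .NET for Apache Spark' \
    'This .NET app counts words with Apache Spark' | word_counts)

echo "$counts"
```

The first row should be .NET with a count of 3, which is what the DataFrame output should also show at the top. The order of words with equal counts may differ between the two.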


Build the program:
dotnet build
Run the program:
spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner --master local bin/Debug/netcoreapp2.2/microsoft-spark-2.4.x-0.4.0.jar dotnet bin/Debug/netcoreapp2.2/mySparkApp.dll
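A wrong path to the jar or the dll is an easy mistake with this command. As a small pre-flight sketch, the loop below checks that both files from the command above exist before you submit. It assumes you run it from the mySparkApp directory, and the variable names out_dir and missing are only for illustration.

```shell
# Build output directory used in the spark-submit command.
out_dir="bin/Debug/netcoreapp2.2"
missing=0

# Check each file that spark-submit is given.
for f in "$out_dir/microsoft-spark-2.4.x-0.4.0.jar" "$out_dir/mySparkApp.dll"; do
    if [ ! -f "$f" ]; then
        echo "missing: $f" >&2
        missing=$((missing + 1))
    fi
done

echo "missing files: $missing"
```

If the count is not 0, run dotnet build again and check the target framework folder name under bin/Debug.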

posted @ 2019-09-19 12:22  农村手艺人