大数据开发目录
spark:
1.分布式计算
2.Spark组件简介
3.RDD(一)
4.RDD(二)
5.spark-submit
6.Schedulers
7.Spark SQL初识
8.spark sql
9.Spark Streaming集成
10.HBase
scala:
1.集合
2.基础
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.11.8</version>
</dependency>
</dependencies>
<build> <plugins> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-compiler-plugin</artifactId> <version>3.1</version> <configuration> <source>1.8</source> <target>1.8</target> <testExcludes> <testExclude>/src/test/**</testExclude> </testExcludes> <encoding>utf-8</encoding> </configuration> </plugin> <plugin> <groupId>net.alchim31.maven</groupId> <artifactId>scala-maven-plugin</artifactId> <version>3.1.6</version> <executions> <execution> <goals> <goal>compile</goal> </goals> </execution> </executions> </plugin> <plugin> <groupId>org.apache.avro</groupId> <artifactId>avro-maven-plugin</artifactId> <version>1.7.7</version> <executions> <execution> <phase>generate-sources</phase> <goals> <goal>schema</goal> </goals> <configuration> <sourceDirectory>${project.basedir}/src/main/avro</sourceDirectory> <outputDirectory>${project.basedir}/src/main/java</outputDirectory> </configuration> </execution> </executions> </plugin> </plugins> </build>
1.数据库中怎么将行转成列
行转列: 以某一列的值,作为新的列,另外一列作为值,比如下面科目就是新的列,而分数就是对应列值,等把之前同一个人的多条成绩,以列的方式合并到同一行,这就是行转列:大概思路-- 就是先每个人分组,组内按照科目摘出对应的成绩,也就是使用case when
SELECT user_name ,
MAX(CASE course WHEN '数学' THEN score ELSE 0 END ) 数学,
MAX(CASE course WHEN '语文' THEN score ELSE 0 END ) 语文,
MAX(CASE course WHEN '英语' THEN score ELSE 0 END ) 英语
FROM test_tb_grade
GROUP BY USER_NAME;
列转行:
把一行的多个列,拆分到多行,大概思路--单独查询每列(这里就是科目的值),然后union
select user_name, '语文' COURSE , CN_SCORE as SCORE from test_tb_grade2
union select user_name, '数学' COURSE, MATH_SCORE as SCORE from test_tb_grade2
union select user_name, '英语' COURSE, EN_SCORE as SCORE from test_tb_grade2
order by user_name,COURSE;