CDP Data Developer Exam (CDP-3001)
The exam tests the skills and knowledge required by Data Developers to use the Cloudera Data Platform to design, build, and maintain data applications and pipelines.
- Number of questions: 79
- Exam duration: 90 minutes
- Passing score: not disclosed…
Topic
- Connect and move data between systems(12Q)
- Build and manage a data warehouse(9Q)
- Build, schedule, execute, and monitor data pipelines(10Q)
- Clean and serve data to the end-users(16Q)
- Perform data quality checks(7Q)
- Debug data issues reported by end-users(4Q)
- Data backup and disaster recovery(7Q)
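Replication Manager (RM)
- What is Replication Manager? What can be replicated with Replication Manager?
- When migrating Hive data as the hdfs user, what permissions are required?
- RM can replicate: Hive, HDFS, Impala
- RM does not support S3. SSE-KMS …?
- RM does not support tables backed by Kudu
- RM security combinations: secure > secure, insecure > insecure, and insecure > secure are possible
- RM multi-cluster: the source may be secure or insecure (both allowed)
- RM cloud storage: Amazon S3, Microsoft Azure ADLS Gen1, Gen2
- RM unsupported: HDP > CDP 7.x / Kerberos enabled, secure > insecure / Hive table managed to managed (managed > external) / for Ranger there is a separate migration rather than replication / RM cannot be used when Knox is present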
HBase replication: use the HBase shell
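A minimal sketch of what setting up replication from the HBase shell could look like. The peer id, ZooKeeper cluster key, and table name are hypothetical placeholders, not values from these notes. The block only prints the HBase shell commands; the last comment shows how they might be piped into a non-interactive shell on a real cluster.

```shell
# Hypothetical values; replace with the real destination cluster details.
PEER_ID="1"                                  # hypothetical peer id
CLUSTER_KEY="zk1.example.com:2181:/hbase"    # hypothetical quorum:port:znode of the destination
TABLE="employee"                             # hypothetical table name

# Emit the HBase shell commands that register the peer and enable
# replication for one table.
hbase_replication_script() {
  cat <<EOF
add_peer '${PEER_ID}', CLUSTER_KEY => "${CLUSTER_KEY}"
enable_table_replication '${TABLE}'
list_peers
EOF
}

hbase_replication_script
# To apply on a cluster (assumes the hbase CLI is on PATH):
#   hbase_replication_script | hbase shell -n
```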
HDFS to HDFS replication : increase the heap size in hadoop-env.sh / add the key-value pair HADOOP_CLIENT_OPTS=-Xmx
??? HADOOP_CLIENT_OPTS
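As a rough illustration of the HADOOP_CLIENT_OPTS setting above, the following sketch shows how the variable might be set in a shell or in hadoop-env.sh. The heap size of 2048 MB is a hypothetical value chosen only for the example; the notes do not say which value to use.

```shell
# Hypothetical heap size in MB; the notes leave the -Xmx value unspecified.
HEAP_MB="${HEAP_MB:-2048}"

# Append the heap flag, keeping any client options that were already set.
export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:+$HADOOP_CLIENT_OPTS }-Xmx${HEAP_MB}m"

echo "HADOOP_CLIENT_OPTS=${HADOOP_CLIENT_OPTS}"
```

In a replication policy the same key-value pair would be entered as an environment setting rather than exported by hand.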
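Remote RM » the destination service is managed by CM / the source must be managed by the same CM or by a peer CM / HDFS data can be replicated between a different source and destination (remote RM)
HDFS replication > files added while distcp is running are not copied / deleting a file during the run causes an error / an open file causes an error / can be configured to continue despite errors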
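For comparison, the "continue despite errors" behaviour has a counterpart in plain distcp, the -i (ignore failures) flag. The sketch below is an assumption about how that might be tried outside Replication Manager; the NameNode hosts and paths are hypothetical. It only builds and prints the command.

```shell
# Hypothetical source and destination paths.
SRC="hdfs://source-nn.example.com:8020/data/employee"
DST="hdfs://dest-nn.example.com:8020/data/employee"

# -update copies only files that are missing or differ at the destination;
# -i keeps the job going when individual copies fail.
build_distcp_cmd() {
  echo "hadoop distcp -update -i $1 $2"
}

CMD="$(build_distcp_cmd "$SRC" "$DST")"
echo "$CMD"
# To run on a cluster (assumes the hadoop CLI is on PATH):
#   $CMD
```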
HDFS Merge
hdfs dfs -getmerge -nl Employee MergedEmployee.txt
hdfs dfs -getmerge -nl [directory to merge] [merge output]
hdfs dfs -chmod 664 Employee/MergedEmployee.txt
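The merge commands can be strung together roughly as follows. Note that getmerge writes to the local filesystem, so the chmod on Employee/MergedEmployee.txt only makes sense once the merged file is back in HDFS. The -put step here is an assumption added to bridge that gap; it is not in the notes. The run helper and the DRY_RUN switch are hypothetical and exist only so the sketch prints the commands instead of requiring a cluster.

```shell
SRC_DIR="Employee"            # HDFS directory to merge (from the notes)
MERGED="MergedEmployee.txt"   # local merged output file (from the notes)

# Print the command when DRY_RUN=1 (the default in this sketch);
# execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

merge_employee() {
  # Concatenate every file in the HDFS directory into one local file,
  # adding a newline after each source file (-nl).
  run hdfs dfs -getmerge -nl "$SRC_DIR" "$MERGED"
  # Assumed step: upload the merged file back into the HDFS directory.
  run hdfs dfs -put -f "$MERGED" "$SRC_DIR/"
  # Owner and group read/write, others read-only.
  run hdfs dfs -chmod 664 "$SRC_DIR/$MERGED"
}

merge_employee
```

Running it with DRY_RUN=0 on a host with the hdfs client would execute the three steps for real.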
Sqoop commands