CDP Data Developer Exam ( CDP-3001 )

The exam tests the skills and knowledge required by Data Developers to use the Cloudera Data Platform to design, build and maintain data applications and pipelines

  • 문제 수 : 79
  • 시험시간 : 90분
  • 합격 점수 : 미공개…

Topic

  • Connect and move data between systems(12Q)
  • Build and manage a data warehouse(9Q)
  • Build, schedule, execute, and monitor data pipelines(10Q)
  • Clean and serve data to the end-users(16 Q)
  • Perform data quality checks(7Q)
  • Debug data issues reported by end-users(4Q)
  • Data backup and disaster recovery(7Q)

Replication Manager는? Replication Manager로 복사 가능한 대상은? Hive Data를 이관, hdfs user 사용, 필요한 권한은? RM > HIVE, HDFS ,IMPALA, RM S3 지원 안함. SSE-KMS …? RM backed by kudu 지원 안함 RM sec > sec / insec > insec / insec > sec 가능 RM multi cluster / source sec or insec all RM Cloud storage / Amanazon s3, MS Azure ALS gen1, Gen2 RM unsupported HDP > CDP7.x / kerberos enabled, sec > insec / hive table managed to managed (managed > external) / Ranger replication이 아니라 migration이 따로 존재 / Knox 있으면 RM 불가

Hbase의 replication : Hbase shell 이용 HDFS to HDFS replication : increase the heap size in hadoop-env.sh / add the key-value pair HADOOP_CLIENT_OPTS=-Xmx

??? HADOOP_CLIENT_OPTS

Remote RM » destination service는 CM이 관리 / Source는 CM이 같거나, 동료(peer)여야 함 / 다른 source, destination의 HDFS data replicate 가능(remote RM) /

HDFS replication > distcp시 추가된 것 카피 x / 동작 중 파일 지우면 에러 / 파일 열려있으면 에러 / 에러가 나도 진행되게 설정 가능

HDFS Merge

hdfs dfs -getmerge -nl Employee MergedEmployee.txt hdfs dfs -getmerge -nl [merger할 Directory] [merge output] hdfs dfs -chmod 664 Employee/MergedEmployee.txt

sqoop 명령어