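I have recently been looking into running Trino on YARN. Most articles online cover Presto on YARN and there is very little material on Trino on YARN, but the two are essentially the same apart from a few changes. Slider is not very convenient to debug, so I am sharing my hands-on process here, with a focus on debugging.
If your Trino cluster has no need for elastic scaling, or you already have a mature Kubernetes deployment, you can skip this feature. The end result is that Slider deploys Trino automatically and adjusts the number of Trino nodes, so the cluster can be scaled out or in quickly. The resources a query consumes have little to do with YARN; they still depend on the resources configured for the Trino cluster. When cluster resources are tight, it is quite practical to rebalance the allocation across time periods: for example, at night, when there are few queries, some nodes can be released so that Flink or Spark can use them for computation.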
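The time-of-day rebalancing described above could be driven by a small scheduler script. The following is only a sketch under stated assumptions: the night window (00:00 to 06:00) and the worker counts are hypothetical values, the application name `presto-query` and component name `WORKER` are taken from the configuration later in this post, and the `slider flex <name> --component <component> <count>` form should be checked against your Slider version before use. The script only builds the command line; it does not run it.

```python
import datetime

# Hypothetical schedule: between NIGHT_START and NIGHT_END (hours, end
# exclusive) the cluster shrinks to NIGHT_WORKERS, otherwise DAY_WORKERS.
NIGHT_START, NIGHT_END = 0, 6
DAY_WORKERS, NIGHT_WORKERS = 3, 1


def desired_workers(hour):
    """Return the worker count for a given hour of day (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    return NIGHT_WORKERS if NIGHT_START <= hour < NIGHT_END else DAY_WORKERS


def flex_command(app_name, hour, slider_bin="slider"):
    """Build (but do not run) a `slider flex` command as an argument list."""
    return [slider_bin, "flex", app_name,
            "--component", "WORKER", str(desired_workers(hour))]


if __name__ == "__main__":
    now = datetime.datetime.now()
    # Print the command so it can be reviewed before being wired into cron.
    print(" ".join(flex_command("presto-query", now.hour)))
```

Returning an argument list rather than a shell string keeps the command easy to pass to `subprocess` later without quoting problems.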
Compiling apache-slider-0.92.0-incubating
def python_command(self, script, script_params):
    # we need manually pass python executable on windows because sys.executable will return service wrapper
    python_binary = os.environ['PYTHON_EXE'] if 'PYTHON_EXE' in os.environ else sys.executable
    python_command = [python_binary, "-S", script] + script_params
    # if Python binary location is not found then fall back to generic Python path
    if not python_binary:
        logger.warn("Python binary not found in this environment. Using /usr/bin/python")
        python_binary = "/usr/bin/python"
        python_command = [python_binary, script] + script_params
    return python_command
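Compiling trino-yarn
1. GitHub repository: https://github.com/prestodb/presto-yarn.git
2. Modify the pom file in the root directory.
3. Modify the dependencies in the presto-yarn-package pom file.
1. After the build finishes, copy the two files to the server and configure appConfig-default.json and resources-default.json for the trino-yarn package. Anyone familiar with Trino or Presto will recognize most of the settings. My configuration, for reference: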
{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": { },
  "global": {
    "site.global.app_user": "presto",
    "site.global.user_group": "presto",
    "site.global.data_dir": "/data/trino/data",
    "site.global.config_dir": "/data/trino/etc",
    "site.global.app_name": "trino-server-418",
    "site.global.app_pkg_plugin": "${AGENT_WORK_ROOT}/app/definition/package/plugins/",
    "site.global.singlenode": "true",
    "site.global.coordinator_host": "192.168.2.182",
    "site.global.presto_query_max_memory": "27GB",
    "site.global.presto_query_max_memory_per_node": "4GB",
    "site.global.presto_query_max_total_memory_per_node": "9GB",
    "site.global.presto_server_port": "8089",
    "site.global.catalog": "{'tpch': ['connector.name=system']}",
    "site.global.jvm_args": "['-server', '-Xmx50G', '-XX:InitialRAMPercentage=80', '-XX:MaxRAMPercentage=80', '-XX:G1HeapRegionSize=32M', '-XX:+ExplicitGCInvokesConcurrent', '-XX:+ExitOnOutOfMemoryError', '-XX:+HeapDumpOnOutOfMemoryError', '-XX:-OmitStackTraceInFastThrow', '-XX:ReservedCodeCacheSize=512M', '-XX:PerMethodRecompilationCutoff=10000', '-XX:PerBytecodeRecompilationCutoff=10000', '-Djdk.attach.allowAttachSelf=true', '-Djdk.nio.maxCachedBufferSize=2000000', '-XX:+UnlockDiagnosticVMOptions', '-XX:+UseAESCTRIntrinsics', '-XX:+UseG1GC']",
    "site.global.log_properties": "['io.trino=INFO']",
    "application.def": ".slider/package/trino/trino-yarn.zip",
    "java_home": "/home/presto/presto/zulu17.42.21-ca-crac-jdk17.0.7-linux_x64/bin/java"
  },
  "components": {
    "slider-appmaster": {
      "jvm.heapsize": "128M"
    }
  }
}
{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": { },
  "global": {
    "yarn.vcores": "1"
  },
  "components": {
    "slider-appmaster": { },
    "WORKER": {
      "yarn.role.priority": "2",
      "yarn.component.instances": "3",
      "yarn.component.placement.policy": "1",
      "yarn.memory": "1500"
    }
  }
}
For details, see: https://prestodb.io/presto-yarn/installation-yarn-configuration-options.html#appconfig-json
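One thing worth checking before launching is whether the JVM heap in `site.global.jvm_args` fits inside the container size in `yarn.memory` (which Slider expresses in MB). The sketch below is a hypothetical helper, not part of presto-yarn: it parses the `-Xmx` flag out of the `jvm_args` string and compares it with the WORKER container memory. It only understands `-Xmx` values with a K, M or G suffix, and whether an oversized heap actually causes containers to be killed depends on how memory enforcement is configured on your NodeManagers, which is an assumption here.

```python
import ast
import re

# Multipliers to convert a JVM size suffix to megabytes.
UNITS_MB = {"K": 1.0 / 1024, "M": 1, "G": 1024}


def xmx_megabytes(jvm_args):
    """Extract -Xmx from a string such as "['-server', '-Xmx50G']".

    Returns the heap size in MB, or None when no suffixed -Xmx is found.
    """
    for arg in ast.literal_eval(jvm_args):
        match = re.match(r"^-Xmx(\d+)([KMG])$", arg, re.IGNORECASE)
        if match:
            return int(match.group(1)) * UNITS_MB[match.group(2).upper()]
    return None


def heap_fits_container(app_config, resources, component="WORKER"):
    """Return True when the configured heap is no larger than yarn.memory."""
    heap_mb = xmx_megabytes(app_config["global"]["site.global.jvm_args"])
    container_mb = int(resources["components"][component]["yarn.memory"])
    return heap_mb is None or heap_mb <= container_mb
```

Both arguments are the parsed JSON documents, for example `json.load(open("appConfig-default.json"))`. With the sample values shown above (`-Xmx50G` against `yarn.memory` of 1500) the check returns `False`, so those two settings are worth reconciling for your own cluster.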
2. Start Slider
../bin/slider create presto-query --template appConfig-default.json --resources resources-default.json
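[Screenshot: the application after a successful launch]
For the full procedure, see this very thorough blog post: "PrestoOnYarn搭建及其問題解決方案總結" (PrestoOnYarn setup and troubleshooting summary) on qq_2368521029's CSDN blog. Here I mainly cover debugging, since there is little material on that.
After deploying to YARN you will run into plenty of problems, and debugging them is somewhat awkward. Here is my debugging approach, for reference.
What the program does is dynamically distribute the trino-server files bundled in the presto-yarn package and generate Trino's configuration files automatically; Slider itself is a generic command-execution framework. The logs show the actual working directory and the exact Python script commands being executed.
Comment out the statement in Slider that executes the script, so that the program runs idle and the script is never actually executed.
Deploying one worker node takes just three steps: INSTALL ---> START ---> STATUS
Using the commands printed in the log, switch to the AGENT_WORK_ROOT directory and run the script by hand; you can then debug against the actual errors. The arguments are copied straight from the log. For example: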
[root@gpmaster scripts]# export PYTHONPATH=/opt/softinstall/hadoop-3.2.3/data/tmp/nm-local-dir/usercache/root/appcache/application_1692453841261_0117/filecache/10/slider-agent.tar.gz/slider-agent/jinja2:/opt/softinstall/hadoop-3.2.3/data/tmp/nm-local-dir/usercache/root/appcache/application_1692453841261_0117/filecache/10/slider-agent.tar.gz/slider-agent
[root@gpmaster scripts]# python presto_worker.py INSTALL /opt/softinstall/hadoop-3.2.3/logs/userlogs/application_1692453841261_0117/container_1692453841261_0117_01_000002/command-1.json /opt/softinstall/hadoop-3.2.3/data/tmp/nm-local-dir/usercache/root/appcache/application_1692453841261_0117/filecache/11/trino-yarn.zip/package /opt/softinstall/hadoop-3.2.3/logs/userlogs/application_1692453841261_0117/container_1692453841261_0117_01_000002/structured-out-1.json INFO /opt/softinstall/hadoop-3.2.3/data/tmp/nm-local-dir/usercache/root/appcache/application_1692453841261_0117/container_1692453841261_0117_01_000002
2023-08-28 17:04:39,453 - Directory['/opt/softinstall/hadoop-3.2.3/data/tmp/nm-local-dir/usercache/root/appcache/application_1692453841261_0117/container_1692453841261_0117_01_000002/app/install'] {'action': ['delete']}
[root@gpmaster scripts]# python presto_worker.py START /opt/softinstall/hadoop-3.2.3/logs/userlogs/application_1692453841261_0117/container_1692453841261_0117_01_000002/command-1.json /opt/softinstall/hadoop-3.2.3/data/tmp/nm-local-dir/usercache/root/appcache/application_1692453841261_0117/filecache/11/trino-yarn.zip/package /opt/softinstall/hadoop-3.2.3/logs/userlogs/application_1692453841261_0117/container_1692453841261_0117_01_000002/structured-out-1.json INFO /opt/softinstall/hadoop-3.2.3/data/tmp/nm-local-dir/usercache/root/appcache/application_1692453841261_0117/container_1692453841261_0117_01_000002
2023-08-28 17:04:39,453 - Directory['/opt/softinstall/hadoop-3.2.3/data/tmp/nm-local-dir/usercache/root/appcache/application_1692453841261_0117/container_1692453841261_0117_01_000002/app/install'] {'action': ['delete']}
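Because these command lines are long and differ only in the application and container IDs, it can help to rebuild them programmatically. The helper below is a hypothetical sketch, not part of Slider: it reproduces the PYTHONPATH and the argument order seen in the commands above from a handful of inputs. The `filecache` numbers (10 and 11 in this run) change between runs, so they are parameters you should copy from your own log, and reusing the same argument layout for the STATUS phase is an assumption.

```python
import posixpath


def agent_debug_command(nm_local_dir, log_dir, user, app_id, container_id,
                        phase, agent_cache="10", package_cache="11",
                        script="presto_worker.py", command_index=1):
    """Return (env, argv) for re-running one agent phase by hand.

    The argument order mirrors the commands copied from the log above.
    """
    app_cache = posixpath.join(nm_local_dir, "usercache", user,
                               "appcache", app_id)
    agent_root = posixpath.join(app_cache, "filecache", agent_cache,
                                "slider-agent.tar.gz", "slider-agent")
    package_dir = posixpath.join(app_cache, "filecache", package_cache,
                                 "trino-yarn.zip", "package")
    container_logs = posixpath.join(log_dir, "userlogs", app_id, container_id)
    # The container directory under appcache is AGENT_WORK_ROOT.
    work_root = posixpath.join(app_cache, container_id)

    env = {"PYTHONPATH": ":".join(
        [posixpath.join(agent_root, "jinja2"), agent_root])}
    argv = ["python", script, phase,
            posixpath.join(container_logs, "command-%d.json" % command_index),
            package_dir,
            posixpath.join(container_logs,
                           "structured-out-%d.json" % command_index),
            "INFO", work_root]
    return env, argv
```

The returned `env` and `argv` can be printed for copy-and-paste, or passed to `subprocess.call(argv, env=dict(os.environ, **env))` from the package's scripts directory.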
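1. File conflicts: generate Trino's etc and data directories under AGENT_WORK_ROOT. This resolves the conflict over these two directories when several instances are scheduled onto the same machine.
The main change is in params.py.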
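The idea behind the params.py change can be illustrated in isolation. This is a standalone sketch, not the actual params.py: the function name is hypothetical, and the real file reads its values from the configuration that Slider passes to the agent. It works because AGENT_WORK_ROOT contains the container ID, so every container gets its own pair of directories.

```python
import posixpath


def per_container_dirs(agent_work_root):
    """Derive etc/ and data/ locations that are unique to one container.

    The keys mirror site.global.config_dir and site.global.data_dir
    from appConfig-default.json.
    """
    return {
        "config_dir": posixpath.join(agent_work_root, "etc"),
        "data_dir": posixpath.join(agent_work_root, "data"),
    }
```

Two workers scheduled onto the same host then write to different paths, instead of both using the fixed /data/trino/etc and /data/trino/data from the sample configuration.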
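2. Port conflicts: add a random-port setting. Comment out http-server.http.port={{presto_work_port}} in the config.properties-WORKER.j2 template and write a randomly chosen port into the configuration instead.
For a concrete implementation, see "trino on yarn排程到同一機器上多範例埠衝突問題處理" (handling port conflicts among multiple instances scheduled onto the same machine) on qq_2368521029's CSDN blog.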
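One common way to pick such a port is to ask the operating system for a free one by binding to port 0. The sketch below shows that technique; it is not the implementation from the referenced blog post, and the function names are hypothetical. Note that the port is released before Trino starts, so another process could in principle take it in between; for a debugging or small-cluster setup that risk is usually acceptable.

```python
import socket


def pick_free_port(host=""):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, 0))
        return sock.getsockname()[1]
    finally:
        sock.close()


def render_port_line(port):
    """Format the property that replaces the commented-out template line."""
    return "http-server.http.port=%d\n" % port


def append_port(config_path, port):
    """Append the port property to an already generated config.properties."""
    with open(config_path, "a") as handle:
        handle.write(render_port_line(port))
```

In the agent scripts this would run after the template has been rendered, for example `append_port(config_dir + "/config.properties", pick_free_port())`, where `config_dir` stands for wherever your scripts write Trino's etc directory.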
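We have modified some of Trino's functionality, so the stock package does not fit our environment; we need to package our own Trino build instead.
1. In the stock package, the trino-yarn/package/files directory holds the trino-server archive. Remove the etc and data directories from our own Trino build, package it, and replace that archive.
2. Modify params.py, configure.py, and the config.properties-WORKER.j2 template so that they generate the configuration you need.
3. Repackage and upload to the designated HDFS directory: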
../bin/slider package --install --name trino --package trino-yarn.zip --replacepkg
4. Specify the JDK. At present we bundle the JDK inside the trino-server directory and modify the start command accordingly.
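The repackaging in step 1 can be scripted so that etc and data are never shipped by accident. The following is a sketch under assumptions: it produces a .tar.gz, which you should confirm matches the archive format your package expects under trino-yarn/package/files, and the function name is hypothetical. Only the top-level etc and data directories are skipped; anything else in the build, including a bundled JDK, is kept.

```python
import os
import tarfile

# Top-level directories that must not be shipped inside the package.
EXCLUDED_TOP_LEVEL = ("etc", "data")


def pack_trino(server_dir, out_path, excluded=EXCLUDED_TOP_LEVEL):
    """Create a .tar.gz of server_dir, skipping top-level etc/ and data/."""
    root_name = os.path.basename(os.path.normpath(server_dir))

    def skip_excluded(tarinfo):
        parts = tarinfo.name.split("/")
        # parts[0] is the archive root; parts[1] is the top-level entry.
        if len(parts) > 1 and parts[1] in excluded:
            return None  # returning None drops the entry and its children
        return tarinfo

    with tarfile.open(out_path, "w:gz") as archive:
        archive.add(server_dir, arcname=root_name, filter=skip_excluded)
    return out_path
```

For example, `pack_trino("/path/to/trino-server-418", "trino-server-418.tar.gz")` leaves the source tree untouched and writes the archive next to it; the paths here are placeholders.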