Installed the distcc package and tried building TensorFlow Lite.
The build normally takes about 30 minutes (on an RPi 3B, 4 cores); how much will distcc cut that down?
(My gut says that since everything lives on SD cards, disk I/O might actually make it slower...)
The nodes didn't seem to connect, and after reading other documents more carefully it turned out I simply hadn't configured things properly!
distcc[946] (dcc_build_somewhere) Warning: failed to distribute, running locally instead
distcc[946] (dcc_parse_hosts) Warning: /home/pi/.distcc/zeroconf/hosts contained no hosts; can't distribute work
distcc[946] (dcc_zeroconf_add_hosts) CRITICAL! failed to parse host file.
Fix ALLOWEDNETS and LISTENER in /etc/default/distcc, then run service distcc restart, and that's it!
$ cat /etc/default/distcc
# Defaults for distcc initscript
# sourced by /etc/init.d/distcc

#
# should distcc be started on boot?
#
STARTDISTCC="true"
#STARTDISTCC="false"

#
# Which networks/hosts should be allowed to connect to the daemon?
# You can list multiple hosts/networks separated by spaces.
# Networks have to be in CIDR notation, e.g. 192.168.1.0/24
# Hosts are represented by a single IP address
#
# ALLOWEDNETS="127.0.0.1"
ALLOWEDNETS="127.0.0.1 192.168.0.0/16"

#
# Which interface should distccd listen on?
# You can specify a single interface, identified by it's IP address, here.
#
# LISTENER="127.0.0.1"
LISTENER=""

#
# You can specify a (positive) nice level for the distcc process here
#
# NICE="10"
NICE="10"

#
# You can specify a maximum number of jobs, the server will accept concurrently
#
# JOBS=""
JOBS=""

#
# Enable Zeroconf support?
# If enabled, distccd will register via mDNS/DNS-SD.
# It can then automatically be found by zeroconf enabled distcc clients
# without the need of a manually configured host list.
#
ZEROCONF="true"
#ZEROCONF="false"
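After editing the file, a quick sanity check is to restart the service and confirm distccd is actually listening. The service name and the default distcc port 3632 are the usual ones on Debian/Raspbian, but verify against your setup:

```shell
# Restart distccd so the new ALLOWEDNETS/LISTENER take effect.
sudo service distcc restart
# distccd accepts jobs on TCP 3632 by default; an empty LISTENER binds all interfaces.
ss -ltn | grep 3632
```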
Passing CC=/usr/lib/distcc/gcc via MAKEFLAGS is the key point here..
tensorflow/tensorflow/lite/tools/make $ cat ./build_rpi_lib.sh
#!/bin/bash
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
set -x
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TENSORFLOW_DIR="${SCRIPT_DIR}/../../../.."
FREE_MEM="$(free -m | awk '/^Mem/ {print $2}')"
# Use "-j 4" only if memory is larger than 2GB
if [[ "${FREE_MEM}" -gt "2000" ]]; then
NO_JOB=4
else
NO_JOB=1
fi
export MAKEFLAGS="CXX=/usr/lib/distcc/g++ CC=/usr/lib/distcc/gcc"
make -j 8 TARGET=rpi -C "${TENSORFLOW_DIR}" -f tensorflow/lite/tools/make/Makefile $@
#make -j ${NO_JOB} CC=/usr/lib/distcc/gcc TARGET=rpi -C "${TENSORFLOW_DIR}" -f tensorflow/lite/tools/make/Makefile $@
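The CC/CXX override works because /usr/lib/distcc holds masquerade links named gcc and g++ that forward to the real compiler through distcc. An equivalent sketch (assuming that standard Debian masquerade directory) avoids touching MAKEFLAGS by letting PATH resolution do the work:

```shell
# Put the masquerade directory first so plain "gcc"/"g++" invocations
# inside the Makefile dispatch through distcc; then run the build as before.
export PATH="/usr/lib/distcc:${PATH}"
```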
Put the names of the nodes you want to use in /etc/distcc/hosts. If the machine itself is not on the list,
distcc will build only on the slave nodes.
# As described in the distcc manpage, this file can be used for a global
# list of available distcc hosts.
#
# The list from this file will only be used, if neither the
# environment variable DISTCC_HOSTS, nor the file $HOME/.distcc/hosts
# contains a valid list of hosts.
#
# Add a list of hostnames in one line, separated by spaces, here.
#
tf2 tf3 +zeroconf
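If the local machine should also compile (rather than only the slaves, as noted above), the list needs to include it explicitly; localhost is the keyword distcc recognizes for running jobs locally, and the /4 per-host job limits below are illustrative:

```
localhost/4 tf2/4 tf3/4 +zeroconf
```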
Occasionally messages like the ones below pop up; if I just ignore them, the slave nodes' CPUs do get consumed anyway, so presumably the nodes attach via zeroconf after all.
distcc[1323] (dcc_build_somewhere) Warning: failed to distribute, running locally instead
distcc[1332] (dcc_build_somewhere) Warning: failed to distribute, running locally instead
[링크 : http://openframeworks.cc/ko/setup/raspberrypi/raspberry-pi-distcc-guide/]
[링크 : http://jtanx.github.io/2019/04/19/rpi-distcc-node/]
+
Watching /var/log/distcc.log,
everything shows COMPILE_OK while it works,
but at some point "client fd disconnected" suddenly appears and the build stalls.
And since it logs time:305000ms or so, each stalled job seems to burn roughly a 5-minute timeout,
which may make this worse than not using distcc at all..
distccd[14090] (dcc_job_summary) client: 192.168.52.209:40940 COMPILE_OK exit:0 sig:0 core:0 ret:0 time:16693ms g++ tensorflow/lite/kernels/cpu_backend_gemm_eigen.cc
distccd[14091] (dcc_collect_child) ERROR: Client fd disconnected, killing job
distccd[14091] (dcc_writex) ERROR: failed to write: Broken pipe
distccd[14091] (dcc_job_summary) client: 192.168.52.209:40932 CLI_DISCONN exit:107 sig:0 core:0 ret:107 time:307172ms
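The ~305 s in the CLI_DISCONN line looks like distcc's I/O timeout (about 300 s by default) expiring. If the installed distcc is new enough to honor it (an assumption; older releases hard-code the value and ignore the variable), the timeout can be raised on the client side:

```shell
# Assumption: this distcc release honors DISTCC_IO_TIMEOUT (seconds).
# Older versions hard-code roughly 300 s and silently ignore the variable.
export DISTCC_IO_TIMEOUT=600
```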
Anyway, whenever it dies with the error above, disk I/O on the individual nodes goes crazy, as this dstat output shows:
--total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai stl| read  writ| recv  send|  in   out | int   csw
  5   2  10  83   0| 928k 4048k|1063B  252B|  68k 2040k|1830  3320
  0   3  27  69   0|7840M   27M|2919k   73k|1512k   11M| 245k  402k missed 238 ticks
  2   1   0  97   0| 176k    0 |   0     0 |8192B    0 |  19    23 missed 2 ticks
+
Maybe I should try adding cpp,lzo?
[링크 : https://wiki.gentoo.org/wiki/Distcc/ko]
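For reference, the Gentoo wiki linked above enables this per host in the hosts file: the cpp option turns on pump mode (preprocessing on the remote side) and lzo compresses the transfers; pump mode requires both. A sketch with the hostnames from this post:

```
tf2,cpp,lzo tf3,cpp,lzo
```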
+
export MAKEFLAGS="CXX=/usr/lib/distcc/g++ CC=/usr/lib/distcc/gcc"
#export MAKEFLAGS="CXX=/usr/bin/distcc-pump CC=/usr/bin/distcc-pump"
make -j 8 TARGET=rpi -C "${TENSORFLOW_DIR}" -f tensorflow/lite/tools/make/Makefile $@
#make -j ${NO_JOB} CC=/usr/lib/distcc/gcc TARGET=rpi -C "${TENSORFLOW_DIR}" -f tensorflow/lite/tools/make/Makefile $@
It runs, but exactly like the non-pump case, I/O goes through the roof and the build dies the same way.
$ distcc-pump ./build_rpi_lib.sh
+
So distccmon-text has to be run on the server node that drives the build, not on a slave node..
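A minimal usage sketch (assuming the distcc package's monitor tool is installed; the argument is the refresh interval in seconds):

```shell
# Run on the machine that starts the compiles; "1" = redraw the job list every second.
distccmon-text 1
```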