bash - ElasticSearch update all documents with bulk API and script -


i have application document field not required array such

"tags" : "tag1"

now application requires field array such

"tags" : ["tag1","tag2"]

currently in elasticsearch there 4.5m documents

so wrote bash script update 1000 documents takes on 2 minutes means take on 8 days run on 4.5m documents. seems i'm doing wrong. best way in elastic? here bash script

#!/bin/bash  echo "starting" ids=$(curl -xget 'http://elastichost/index/_search?size=1000' -d '{ "query" : {"match_all" : {}},"fields":"[_id]"}' | grep -po '"_id":.*?[^\\]",'| awk -f':' '{print $2}'| sed -e 's/^"//' -e 's/",$//') #create array out of ids array=($ids) #loop through ids , update them in "${!array[@]}"             echo "$i=>|${array[i]}|"             curl -xpost "http://elastichost/index/type/${array[i]}/_update" -d '               {                 "script" : "ctx._source.tags = [ctx._source.tags]"               }'     done echo "\nfinished" 

add "> /dev/null 2>&1 &" command, ensure process gets forked , doesn’t log anywhere.

the equivalent shell command looks this:

    curl -xpost "http://elastichost/index/type/${array[i]}/_update" -d '       {         "script" : "ctx._source.tags = [ctx._source.tags]"       }' > /dev/null 2>&1 & 

it takes little on 1ms fork process, uses around 4k of resident memory. while curl process takes standard ssl 300ms make request

on moderately sized machine, can fork around 100 https curl requests per second without them stacking in memory. without ssl, can more:

  • forking process without waiting output fast.
  • curl takes same time make request socket, processed out of band.
  • forking curl requires normal unix primitives.
  • forking sets single request few milliseconds, many concurrent forks start slow servers.

do not echo in terminal.

reference: link


Comments

Popular posts from this blog

javascript - Chart.js (Radar Chart) different scaleLineColor for each scaleLine -

apache - Error with PHP mail(): Multiple or malformed newlines found in additional_header -

java - Android – MapFragment overlay button shadow, just like MyLocation button -