Assigning data to hash taking longer than expected - Ruby on Rails/Ruby -


since axlsx can't handle excel files contain large amount of data without having issues, i've decided build own specific purpose. generates spreadsheet 0 issues far opening it; however, takes lot longer expect.

here's i'm working with:

string_file = "#{tmp_folder}/xl/sharedstrings.xml" sheet_file = "#{tmp_folder}/xl/worksheets/sheet1.xml"  col_range = ["a", "b", "c"]  ref_strings = {} row_num = 2 # because have header row.  strings_template = nokogiri::xml(file.open(string_file)) main_file = nokogiri::xml(file.open(sheet_file))  amount_of_strings = strings_template.to_xml.scan("<si>").length # counts number of existing strings.  issues = report.find(1).issues nokogiri::xml::builder.with(main_file.at_xpath("//*[name()='sheetdata']")) |row|     nokogiri::xml::builder.with(strings_template.at_xpath("//*[name()='sst']")) |xml|         issues.each |issue|             row.row("customheight" => "1", "ht" => "14.15", "r" => "#{row_num}", "spans" => "1:17", "x14ac:dydescent" => "0.35") {             col_range.each |col|                 col_num = col_range.index(col)                 tmp_string_text = case col_num                 when 0                     issue.details                 when 1                     issue.system.ip                 when 2                     issue.system.dns_name                 end                  if !ref_strings.keys.include? tmp_string_text                     ref_strings[tmp_string_text] = amount_of_strings                      string_ref = amount_of_strings                                   xml.si {                         xml.t tmp_string_text                      }                     amount_of_strings += 1                 else                     string_ref = ref_strings[tmp_string_text]                 end                   row.c("r" => "#{col}#{row_num}", "t" => "s") {                     row.v string_ref                 }             end             row_num += 1             }         end     end end 

the way excel adds data cell (in xml format) inserting text <si><t>data here</t></si> tags within string_file (in example), , builds out row in sheet_file (in example) , references text based on <si> index number. in example, again, works fine.

however, i'm noticing process takes longer when un-comment if condition adds text ref_strings hash. it's checking see if text exists, , if not, adds looks {"here's random text" => "4"}.

is there way can speed while adding text hash reference later? if comment out 1 line adds text hash, 30k rows of data finishes producing excel spreadsheet in 26 seconds. if add in there, takes 20 minutes (no joke).


Comments