Wednesday, June 15, 2022

PowerShell Script: Convert a UTF-8 Text File to a UTF-16 (Unicode) File; Salesforce Data Loader CSV Export to UTF-16 (Addendum on Converting SFDC Native Character Encoding)

The native Data Loader from Salesforce only exports .csv files in UTF-8 encoding. In this case the data has to be exchanged with Therefore™, an information management application that uses UTF-16 encoded data.

The pros and cons of UTF-8 and UTF-16 (Unicode) encoding can be found here.
Storage size, indexing speed, and similar factors matter.

The easiest way to do this is a PowerShell script that re-encodes the .csv exported by Salesforce Data Loader as Unicode, essentially in one line of .ps1 script. Schedule the script in Windows Task Scheduler and you are done.

  

Get-content -encoding UTF8 $filename | Set-Content -encoding Unicode "$new_filename"
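To check that a converted file really is UTF-16, one option is to look at its first two bytes. In PowerShell, -Encoding Unicode means UTF-16 little-endian, and Set-Content writes the byte order mark FF FE at the start of the file. The sketch below is only an illustration: the function name Test-Utf16LeBom and the sample file name are made up for this example and are not part of the original script.

```powershell
# Hypothetical helper (not part of the original script): returns $true when
# the file starts with the UTF-16 little-endian byte order mark, FF FE.
function Test-Utf16LeBom {
    param([Parameter(Mandatory = $true)][string]$Path)

    # .NET does not follow PowerShell's current location, so resolve to a full path first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        # Only the first two bytes are needed for the check.
        $bom = New-Object byte[] 2
        $read = $stream.Read($bom, 0, 2)
    }
    finally {
        $stream.Dispose()
    }
    return ($read -eq 2 -and $bom[0] -eq 0xFF -and $bom[1] -eq 0xFE)
}

# Usage: write a small sample file as Unicode, then test it.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'bom_check_utf16.csv'
'Id,Name' | Set-Content -Encoding Unicode $sample
Test-Utf16LeBom -Path $sample
```

A file that is still UTF-8 fails this check, so it can serve as a quick sanity test before handing the .csv over to the receiving application.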

The script below shows how to convert multiple UTF-8 files in a folder to UTF-16, with error logging.

Both the UTF-8 and the UTF-16 files will be in the folder after conversion.
The UTF-8 abc.csv is kept, and a UTF-16 copy named abc_utf16.csv is created in the same folder.
# Keep converting even if one file fails; just log the error.
# Stop any transcript left over from a previous run, ignoring the error if none is active.
$ErrorActionPreference = "SilentlyContinue"
Stop-Transcript | Out-Null
$ErrorActionPreference = "Continue"
# Path of the log file
Start-Transcript -Path "C:\ABCProject\log\utfLog.txt" -Append
try {
    # Use double quotes for paths that contain spaces.
    # Get all .csv files in the folder, skipping output from earlier runs.
    Get-ChildItem "C:\ABCProject\write\*.csv" |
        Where-Object { $_.Name -notlike "*_utf16.csv" } |
        ForEach-Object {
            $name = $_.FullName
            Write-Host "The file name" $name
            # Put the suffix before the extension: abc.csv -> abc_utf16.csv
            $fname = Join-Path $_.DirectoryName ($_.BaseName + "_utf16.csv")
            Write-Host "The new file" $fname
            try {
                Get-Content -Encoding UTF8 -ErrorAction Stop $name |
                    Set-Content -Encoding Unicode -ErrorAction Stop $fname
            }
            catch {
                # Log the failure and move on to the next file.
                Write-Host "Error found" $_.Exception.Message
            }
        }
}
finally {
    # Always close the transcript, even after an unexpected error.
    Stop-Transcript
}
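
For the scheduling step, the task can be created in the Task Scheduler GUI, or registered from PowerShell with the ScheduledTasks module that ships with Windows 8 / Server 2012 and later. The lines below are a minimal sketch, not a tested setup: the script path C:\ABCProject\ConvertToUtf16.ps1, the task name, and the daily 2 AM schedule are all assumptions for illustration, and registering a task may require an elevated session.

```powershell
# Assumption: the conversion script is saved at this (hypothetical) path.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument '-NoProfile -ExecutionPolicy Bypass -File "C:\ABCProject\ConvertToUtf16.ps1"'

# Assumption: run once a day at 2 AM; adjust to match the Data Loader export schedule.
$trigger = New-ScheduledTaskTrigger -Daily -At 2am

# Hypothetical task name; pick one that fits your environment.
Register-ScheduledTask -TaskName 'ConvertDataloaderCsvToUtf16' -Action $action -Trigger $trigger
```

Scheduling the conversion a little after the Data Loader export finishes avoids reading a .csv that is still being written.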
