Category "Golang"

Basic File Operations – Golang

Categories: Golang, Tutorials

One of the most basic tasks when working on a server is operating effectively on files and the file system. Like many languages, Golang has convenient methods for working with files. The intention of this post is to host a minimalist set of examples of working with files using Golang.

Creating an empty file
package main

import (
  "log"
  "os"
)

func createFile(){
  newFile, err := os.Create("testfile.txt")
  if err != nil{
     log.Fatal(err)
  }
  defer newFile.Close()

  log.Println("Created", newFile.Name())
  // operations on file follows
}


func main(){
  createFile()
}
Reading a File
package main

import (
  "fmt"
  "io"
  "log"
  "os"
)

func readFile() {
  fileReader, err := os.Open("test.txt")
  if err != nil {
     log.Fatal(err)
  }
  defer fileReader.Close()

  // io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+)
  content, err := io.ReadAll(fileReader)
  if err != nil {
     log.Fatal(err)
  }
  fmt.Println("Number of bytes read", len(content))
  fmt.Printf("Data as string : %s", content)
}

func main() {
  readFile()
}
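Reading the whole file into memory, as above, works well for small files. For larger files, reading line by line is a common alternative. The following is a minimal sketch using bufio.Scanner; the helper names readLines and readLinesDemo and the scratch file are assumptions made for illustration, not part of the original example.

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// readLines returns the lines of the file at path, without line endings.
func readLines(path string) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var lines []string
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	// Scan stops on both end of file and errors; Err tells the two apart.
	return lines, scanner.Err()
}

// readLinesDemo is a hypothetical helper: it writes a small scratch file
// in a temporary directory and reads it back with readLines.
func readLinesDemo() ([]string, error) {
	dir, err := os.MkdirTemp("", "readlines-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "test.txt")
	if err := os.WriteFile(path, []byte("first\nsecond\nthird\n"), 0644); err != nil {
		return nil, err
	}
	return readLines(path)
}

func main() {
	lines, err := readLinesDemo()
	if err != nil {
		log.Fatal(err)
	}
	for i, line := range lines {
		fmt.Printf("%d: %s\n", i+1, line)
	}
}
```

Note that bufio.Scanner limits a single line to 64 KiB by default, so files with very long lines may need a larger buffer or a different reader.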
Writing to a file
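// Requires the "log" and "os" imports.
func writeToFile() {
  // os.WriteFile replaces the deprecated ioutil.WriteFile (Go 1.16+);
  // it creates the file if needed and truncates it otherwise.
  err := os.WriteFile("testwrite.txt", []byte("Dumping bytes to a file\n"), 0666)
  if err != nil {
     log.Fatal(err)
  }
}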
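WriteFile replaces the contents of an existing file. When the goal is to add to the end of a file instead, one option is os.OpenFile with the O_APPEND flag. The sketch below assumes that use case; appendLine and appendDemo are hypothetical helper names, and the scratch file lives in a temporary directory.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// appendLine adds line plus a newline to the end of the file at path,
// creating the file if it does not exist yet.
func appendLine(path, line string) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		return err
	}
	if _, err := f.WriteString(line + "\n"); err != nil {
		f.Close()
		return err
	}
	// Close can report a delayed write error, so its result is returned.
	return f.Close()
}

// appendDemo is a hypothetical helper: it appends two lines to a scratch
// file and returns the resulting file contents.
func appendDemo() (string, error) {
	dir, err := os.MkdirTemp("", "append-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "testwrite.txt")
	for _, line := range []string{"first", "second"} {
		if err := appendLine(path, line); err != nil {
			return "", err
		}
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	content, err := appendDemo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(content)
}
```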
Truncate File
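// Requires the "fmt", "log" and "os" imports.
func TruncateFile() {
  // Resize test.txt to exactly 100 bytes; a longer file loses the bytes past that point.
  err := os.Truncate("test.txt", 100)
  if err != nil {
     log.Fatal(err)
  }
  fmt.Println("File Truncated")
}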
File Information
func FileInformation(){
  fileInfo, err := os.Stat("test.txt")
  if err != nil {
     log.Fatal(err)
  }
  fmt.Println("File name:", fileInfo.Name())
  fmt.Println("Size in bytes:", fileInfo.Size())
  fmt.Println("Permissions:", fileInfo.Mode())
}
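The truncation and file information examples above can be combined to check that os.Truncate really changes the size reported by os.Stat. This is only a sketch: truncatedSize is a hypothetical helper, and it works on a scratch file in a temporary directory rather than on test.txt.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// truncatedSize is a hypothetical helper: it writes content to a scratch
// file, truncates the file to n bytes, and returns the size from os.Stat.
func truncatedSize(content string, n int64) (int64, error) {
	dir, err := os.MkdirTemp("", "truncate-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "test.txt")
	if err := os.WriteFile(path, []byte(content), 0644); err != nil {
		return 0, err
	}
	if err := os.Truncate(path, n); err != nil {
		return 0, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	size, err := truncatedSize("0123456789", 4)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Size after truncation:", size) // prints "Size after truncation: 4"
}
```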
Check if file exists or not
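package main

import (
    "errors"
    "fmt"
    "io/fs"
    "log"
    "os"
)

func main() {
    _, err := os.Stat("test.txt")
    if errors.Is(err, fs.ErrNotExist) {
        fmt.Println("The file does not exist")
        return
    }
    if err != nil {
        // Stat failed for another reason (for example, permissions),
        // so existence cannot be determined.
        log.Fatal(err)
    }
    fmt.Println("The file exists!")
    // perform operations on the file/file contents
}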
Move/Rename File
package main
 
import (
    "log"
    "os"
)
 
func main() {
    currentFileLocation := "/home/bhishan-1504/golangfilehandling/test.txt"
    toMoveLocation := "/home/bhishan-1504/golangrestapi/test.txt"
    err := os.Rename(currentFileLocation, toMoveLocation)
    if err != nil {
        log.Fatal(err)
    }
}
Copy File
package main

import (
  "io"
  "log"
  "os"
  "fmt"
)

func main(){
  source, err := os.Open("test.txt")
  if err != nil{
    log.Fatal(err)
  }
  defer source.Close()

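  // O_TRUNC empties an existing destination first, so no stale bytes remain after a shorter copy.
  // os.Create("copiedtest.txt") is an equivalent shorthand.
  sourceCopied, err := os.OpenFile("copiedtest.txt", os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0666)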
  if err != nil{
    log.Fatal(err)
  }
  defer sourceCopied.Close()

  _, err = io.Copy(sourceCopied, source)
  if err != nil{
    log.Fatal(err)
  }

 
  fmt.Println("Copied!")
}
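The copy logic above can also be wrapped in a reusable function that returns errors to the caller instead of exiting. The following is a minimal sketch under that assumption; copyFile and copyDemo are hypothetical names, and the demo copies a scratch file inside a temporary directory.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"os"
	"path/filepath"
)

// copyFile copies src to dst, replacing dst if it already exists,
// and returns the number of bytes copied.
func copyFile(src, dst string) (int64, error) {
	in, err := os.Open(src)
	if err != nil {
		return 0, err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(out, in)
	if err != nil {
		out.Close()
		return n, err
	}
	// Close can report a delayed write error, so its result is returned.
	return n, out.Close()
}

// copyDemo is a hypothetical helper: it writes content to a scratch file,
// copies it with copyFile, and returns what ended up in the copy.
func copyDemo(content string) (string, error) {
	dir, err := os.MkdirTemp("", "copy-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "test.txt")
	dst := filepath.Join(dir, "copiedtest.txt")
	if err := os.WriteFile(src, []byte(content), 0644); err != nil {
		return "", err
	}
	if _, err := copyFile(src, dst); err != nil {
		return "", err
	}
	data, err := os.ReadFile(dst)
	return string(data), err
}

func main() {
	copied, err := copyDemo("some file content\n")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(copied)
}
```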

Web Scraping using Golang

Categories: Golang, Tutorials

Web scraping can be beneficial to individuals and companies. The intention of this post is to host a set of examples of web scraping using Golang and goquery. I will be using GitHub's trending page https://github.com/trending throughout this post for the examples, because it is well suited to applying various goquery methods. There are two other versions of this article which replicate the same set of examples in Python and NodeJS.

Installation

go get github.com/PuerkitoBio/goquery

Get html of a page
package main
import (
    "log"
    "io"
    "os"
    "net/http"
)

func ScrapeHTML(){
    resp, err := http.Get("https://github.com/trending")
    if err != nil{
        log.Fatal(err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != 200{
        log.Fatalf("status code error: %d %s", resp.StatusCode, resp.Status)
    }
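    // Stream the page to standard output and report any read or write failure.
    if _, err := io.Copy(os.Stdout, resp.Body); err != nil {
        log.Fatal(err)
    }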
}

func main(){
    ScrapeHTML()
}
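The example above uses http.Get, which relies on Go's default HTTP client, and that client has no timeout. A scraper may prefer a client with an explicit timeout. The sketch below shows one way to do that; fetchBody and fetchFromLocalServer are hypothetical helper names, the 10 second timeout is an arbitrary choice, and a throwaway local test server stands in for the real page so the sketch runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchBody downloads url with the given client and returns the body as a string.
func fetchBody(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("status code error: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// fetchFromLocalServer is a hypothetical helper: it serves page from a
// temporary local server and fetches it with a client that has a timeout.
func fetchFromLocalServer(page string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, page)
	}))
	defer srv.Close()

	// The timeout covers the whole request, including reading the body.
	client := &http.Client{Timeout: 10 * time.Second}
	return fetchBody(client, srv.URL)
}

func main() {
	body, err := fetchFromLocalServer("<title>demo</title>")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(body)
}
```

To use it against a real site, pass the page URL to fetchBody instead of the local server address.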

Using goquery (a Golang library) to get the title of a page

package main
import (
    "fmt"
    "log"
    "net/http"
    "github.com/PuerkitoBio/goquery"
)

func ScrapeHTML(){
    resp, err := http.Get("https://github.com/trending")
    if err != nil{
        log.Fatal(err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != 200{
        log.Fatalf("status code error: %d %s", resp.StatusCode, resp.Status)
    }

    doc, err := goquery.NewDocumentFromReader(resp.Body)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(doc.Find("title").Text())
   
}

func main(){
    ScrapeHTML()
}
Output

$ go run example.go
Trending repositories on GitHub today · GitHub

Using goquery to find a single element or multiple elements by tag name
package main
import (
    "fmt"
    "log"
    "strings"
    "net/http"
    "github.com/PuerkitoBio/goquery"
)


func scrapeUsingTagNames(){
    resp, err := http.Get("https://github.com/trending")
    if err != nil{
        log.Fatal(err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != 200{
        log.Fatalf("Status code error: %d %s", resp.StatusCode, resp.Status)
    }
    doc, err := goquery.NewDocumentFromReader(resp.Body)
    if err != nil{
        log.Fatal(err)
    }
    fmt.Println(doc.Find("title").Text())

    doc.Find("ol li").Each(func(i int, s *goquery.Selection){
        fmt.Println(strings.TrimSpace(s.Find("h3").Text()))
    })
}

func main(){
    scrapeUsingTagNames()
}
Output


$ go run example.go
Trending repositories on GitHub today · GitHubAsset 1Asset 1
you-dont-need / You-Dont-Need-Momentjs
ripienaar / free-for-dev
Nozbe / WatermelonDB
cjbarber / ToolsOfTheTrade
byoungd / English-level-up-tips-for-Chinese
TheAlgorithms / Python
thedaviddias / Front-End-Checklist
zziz / pwc
dawnlabs / carbon
CyC2018 / CS-Notes
Avik-Jain / 100-Days-Of-ML-Code
donnemartin / system-design-primer
mariusandra / pigeon-maps
Snailclimb / JavaGuide
JavaNoober / BackgroundLibrary
crossoverJie / JCSprout
Microsoft / nni
PansonPanson / Java-Notes
date-fns / date-fns
sindresorhus / ky
mciastek / sal
rwv / chinese-dos-games
vuejs / vue
GoogleCloudPlatform / open-match
lin-xin / vue-manage-system

Getting Attributes of an element
package main
import (
    "fmt"
    "log"
    "net/http"
    "github.com/PuerkitoBio/goquery"
)

func scrapeAttributes(){
    resp, err := http.Get("https://github.com/trending")
    if err != nil{
        log.Fatal(err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != 200{
        log.Fatalf("Status code error: %d %s", resp.StatusCode, resp.Status)
    }
    doc, err := goquery.NewDocumentFromReader(resp.Body)
    if err != nil{
        log.Fatal(err)
    }
    fmt.Println(doc.Find("title").Text())

    doc.Find("ol li").Each(func(i int, s *goquery.Selection){
        href, exists := s.Find("a").First().Attr("href")
        if exists {
            fmt.Println("https://github.com" + href)
        }

    })
}

func main(){
    scrapeAttributes()
}

Output


$ go run example.go
Trending repositories on GitHub today · GitHubAsset 1Asset 1
https://github.com/you-dont-need/You-Dont-Need-Momentjs
https://github.com/ripienaar/free-for-dev
https://github.com/Nozbe/WatermelonDB
https://github.com/cjbarber/ToolsOfTheTrade
https://github.com/byoungd/English-level-up-tips-for-Chinese
https://github.com/TheAlgorithms/Python
https://github.com/thedaviddias/Front-End-Checklist
https://github.com/zziz/pwc
https://github.com/dawnlabs/carbon
https://github.com/CyC2018/CS-Notes
https://github.com/Avik-Jain/100-Days-Of-ML-Code
https://github.com/donnemartin/system-design-primer
https://github.com/mariusandra/pigeon-maps
https://github.com/Snailclimb/JavaGuide
https://github.com/JavaNoober/BackgroundLibrary
https://github.com/crossoverJie/JCSprout
https://github.com/Microsoft/nni
https://github.com/PansonPanson/Java-Notes
https://github.com/date-fns/date-fns
https://github.com/sindresorhus/ky
https://github.com/mciastek/sal
https://github.com/rwv/chinese-dos-games
https://github.com/vuejs/vue
https://github.com/GoogleCloudPlatform/open-match
https://github.com/lin-xin/vue-manage-system

 

Using class name or other attributes to get element
package main
import (
    "fmt"
    "log"
    "strings"
    "net/http"
    "github.com/PuerkitoBio/goquery"
)
func scrapeViaClassName(){
    resp, err := http.Get("https://github.com/trending")
    if err != nil{
        log.Fatal(err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != 200{
        log.Fatalf("Status code error: %d %s", resp.StatusCode, resp.Status)
    }
    doc, err := goquery.NewDocumentFromReader(resp.Body)
    if err != nil{
        log.Fatal(err)
    }
    fmt.Println(doc.Find("title").Text())

    doc.Find("ol li").Each(func(i int, s *goquery.Selection){
        fmt.Println(strings.TrimSpace(s.Find(".float-sm-right").Text()))
    })
}


func main(){
    scrapeViaClassName()
}
Output


$ go run example.go
Trending repositories on GitHub today · GitHub
625 stars today
476 stars today
407 stars today
392 stars today
332 stars today
316 stars today
304 stars today
274 stars today
249 stars today
201 stars today
206 stars today
188 stars today
192 stars today
165 stars today
154 stars today
141 stars today
153 stars today
146 stars today
153 stars today
149 stars today
145 stars today
134 stars today
124 stars today
137 stars today
117 stars today

Navigating the children of an element
package main
import (
    "fmt"
    "log"
    "strings"
    "net/http"
    "github.com/PuerkitoBio/goquery"
)
func navigateChildrens(){
    resp, err := http.Get("https://github.com/trending")
    if err != nil{
        log.Fatal(err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != 200{
        log.Fatalf("Status code error: %d %s", resp.StatusCode, resp.Status)
    }
    doc, err := goquery.NewDocumentFromReader(resp.Body)
    if err != nil{
        log.Fatal(err)
    }
    fmt.Println(doc.Find("title").Text())
    olSelection := doc.Find("ol")
    olSelection.Children().Each(func(i int, s *goquery.Selection){ // using .Children() on the ol selection to get all li
        fmt.Println(strings.TrimSpace(s.Find("h3").Text()))
    })
}


func main(){
    navigateChildrens()
}
Output


$ go run example.go
Trending repositories on GitHub today · GitHub
you-dont-need / You-Dont-Need-Momentjs
ripienaar / free-for-dev
Nozbe / WatermelonDB
cjbarber / ToolsOfTheTrade
byoungd / English-level-up-tips-for-Chinese
TheAlgorithms / Python
thedaviddias / Front-End-Checklist
zziz / pwc
dawnlabs / carbon
CyC2018 / CS-Notes
Avik-Jain / 100-Days-Of-ML-Code
donnemartin / system-design-primer
mariusandra / pigeon-maps
Snailclimb / JavaGuide
JavaNoober / BackgroundLibrary
crossoverJie / JCSprout
Microsoft / nni
PansonPanson / Java-Notes
date-fns / date-fns
sindresorhus / ky
mciastek / sal
rwv / chinese-dos-games
vuejs / vue
GoogleCloudPlatform / open-match
lin-xin / vue-manage-system

The Children() method returns only the immediate children of the parent element.

Navigating previous and next siblings of an element
package main
import (
    "fmt"
    "log"
    "strings"
    "net/http"
    "github.com/PuerkitoBio/goquery"
)

func navigateSiblings(){
    resp, err := http.Get("https://github.com/trending")
    if err != nil{
        log.Fatal(err)
    }
    defer resp.Body.Close()
    if resp.StatusCode != 200{
        log.Fatalf("Status code error: %d %s", resp.StatusCode, resp.Status)
    }
    doc, err := goquery.NewDocumentFromReader(resp.Body)
    if err != nil{
        log.Fatal(err)
    }
    fmt.Println(doc.Find("title").Text())
    liSelection := doc.Find("ol li")
    fifthElement := liSelection.Eq(4) // using Eq() and passing the index we can navigate to the element with given index
    fmt.Println(strings.TrimSpace(fifthElement.Find("h3").Text()))
    fourthElement := fifthElement.Prev()
    fmt.Println(strings.TrimSpace(fourthElement.Find("h3").Text()))
    sixthElement := fifthElement.Next()
    fmt.Println(strings.TrimSpace(sixthElement.Find("h3").Text()))
}


func main(){
    navigateSiblings()
}
Output


$ go run example.go
Trending repositories on GitHub today · GitHub
byoungd / English-level-up-tips-for-Chinese
cjbarber / ToolsOfTheTrade
TheAlgorithms / Python

Putting it all together (GitHub trending scraper using Golang)
package main
import (
    "fmt"
    "log"
    "strings"
    "net/http"
    "github.com/PuerkitoBio/goquery"
)

func githubTrendingScraper(){
    resp, err := http.Get("https://github.com/trending")
    if err != nil{
        log.Fatal(err)
    }
    defer resp.Body.Close()
    if resp.StatusCode != 200{
        log.Fatalf("Status code error: %d %s", resp.StatusCode, resp.Status)
    }
    doc, err := goquery.NewDocumentFromReader(resp.Body)
    if err != nil{
        log.Fatal(err)
    }

    fmt.Println(doc.Find("title").Text())
    doc.Find("ol li").Each(func (i int, s *goquery.Selection){
        repositoryName := strings.TrimSpace(s.Find("h3").Text())
        totalStarsToday := strings.TrimSpace(s.Find(".float-sm-right").Text())
        href, exists := s.Find("a").Attr("href")
        // Only prefix the host when a link was actually found.
        repositoryURL := "No valid url found"
        if exists {
            repositoryURL = "https://github.com" + href
        }
        fmt.Println(repositoryName, "\t", totalStarsToday, "\t", repositoryURL)
    })

}


func main(){
    githubTrendingScraper()
}
Output


$ go run example.go
Trending repositories on GitHub today · GitHub
you-dont-need / You-Dont-Need-Momentjs      625 stars today https://github.com/you-dont-need/You-Dont-Need-Momentjs
ripienaar / free-for-dev                    476 stars today https://github.com/ripienaar/free-for-dev
Nozbe / WatermelonDB                        407 stars today https://github.com/Nozbe/WatermelonDB
cjbarber / ToolsOfTheTrade                  392 stars today https://github.com/cjbarber/ToolsOfTheTrade
byoungd / English-level-up-tips-for-Chinese 332 stars today https://github.com/byoungd/English-level-up-tips-for-Chinese
TheAlgorithms / Python                      316 stars today https://github.com/TheAlgorithms/Python
thedaviddias / Front-End-Checklist          304 stars today https://github.com/thedaviddias/Front-End-Checklist
zziz / pwc                                  274 stars today https://github.com/zziz/pwc
dawnlabs / carbon                           249 stars today https://github.com/dawnlabs/carbon
CyC2018 / CS-Notes                          201 stars today https://github.com/CyC2018/CS-Notes
Avik-Jain / 100-Days-Of-ML-Code             206 stars today https://github.com/Avik-Jain/100-Days-Of-ML-Code
donnemartin / system-design-primer          188 stars today https://github.com/donnemartin/system-design-primer
mariusandra / pigeon-maps                   192 stars today https://github.com/mariusandra/pigeon-maps
Snailclimb / JavaGuide                      165 stars today https://github.com/Snailclimb/JavaGuide
JavaNoober / BackgroundLibrary              154 stars today https://github.com/JavaNoober/BackgroundLibrary
crossoverJie / JCSprout                     141 stars today https://github.com/crossoverJie/JCSprout
Microsoft / nni                             153 stars today https://github.com/Microsoft/nni
PansonPanson / Java-Notes                   146 stars today https://github.com/PansonPanson/Java-Notes
date-fns / date-fns                         153 stars today https://github.com/date-fns/date-fns
sindresorhus / ky                           149 stars today https://github.com/sindresorhus/ky
mciastek / sal                              145 stars today https://github.com/mciastek/sal
rwv / chinese-dos-games                     134 stars today https://github.com/rwv/chinese-dos-games
vuejs / vue                                 124 stars today https://github.com/vuejs/vue
GoogleCloudPlatform / open-match            137 stars today https://github.com/GoogleCloudPlatform/open-match
lin-xin / vue-manage-system                 117 stars today https://github.com/lin-xin/vue-manage-system