Research Notes

Anaconda Prompt command summary 2021.11.17
Rotating NetCDF file handling in ArcGIS 2021.11.17
R / Python: when Korean text is garbled on Windows 2021.11.16
ModuleNotFoundError: No module named 2021.11.16
Python: commands for directory paths 2021.11.16
Error: ModuleNotFoundError: No module named 'matplotlib' (RStudio) 2021.11.11
How to uninstall Anaconda (same for Miniconda) 2021.11.11
R Keras: when model results come out as loss: NaN, mae: NaN, etc. 2021.11.05
R Keras: reading a csv file and splitting it into train and test data 2021.11.05
R ggplot: drawing stacked bar charts 2021.11.05

Anaconda Prompt command summary

airmaster 2021. 11. 17. 12:41


(base) conda info --envs

(base) conda list

(base) conda remove --prefix [PATH] --all

(base) conda remove --name [ENV_NAME] --all

(base) activate test

(test)
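The listing command above can also be read from a script. The following is a minimal sketch, assuming that `conda env list --json` prints a JSON object with an "envs" list of environment paths; the sample paths and the helper name env_names are hypothetical and only illustrate the parsing.

```python
import json


def env_names(conda_json):
    """Return the last folder name of each environment path in the JSON text."""
    env_paths = json.loads(conda_json).get("envs", [])
    # Normalise Windows separators so the example behaves the same on any OS.
    return [p.replace("\\", "/").rstrip("/").split("/")[-1] for p in env_paths]


# Hypothetical sample of what the command might print.
sample = json.dumps({
    "envs": [
        "C:\\Users\\me\\miniconda3",              # the base environment
        "C:\\Users\\me\\miniconda3\\envs\\test",  # a named environment
    ]
})
print(env_names(sample))
```

Note that the base environment shows up under its install folder name rather than as "base". On a machine with conda installed, the JSON text could come from subprocess.run(["conda", "env", "list", "--json"], capture_output=True, text=True).stdout.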

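The conda and pip commands play the same role here: both install modules.

Note: when installing modules, I recommend using pip rather than conda (see the post below). With conda, version conflicts or conflicts with Miniconda can occur.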

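How to repair TensorFlow GPU without reinstalling when it does not work

This post summarizes how to repair the setup, rather than wiping and reinstalling everything, when tensorflow_gpu does not work even though an NVIDIA graphics card and Anaconda are installed. Symptom: in the figure below, only the CPU is recognized. Several times ...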

aeir.tistory.com


License: Attribution, NonCommercial, NoDerivatives

Rotating NetCDF file handling in ArcGIS

2021. 11. 17. 12:28

R / Python: when Korean text is garbled on Windows

airmaster 2021. 11. 16. 16:21


The default encoding differs from one operating system to another.

Linux uses utf8.

Windows (Korean locale) uses cp949.

Solution

import pandas as pd

df=pd.read_csv(infile, encoding='cp949')

Note

On Linux there is usually no need to set the encoding.

pandas to_csv() already defaults to utf-8.
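When it is not known in advance which system produced a file, one option is to try utf-8 first and fall back to cp949. The sketch below uses only the standard library; the helper name read_text_with_fallback and the demo file are assumptions made for illustration, not part of the original post.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8", "cp949")):
    """Try each encoding in turn and return (text, encoding_used)."""
    for enc in encodings:
        try:
            with open(path, encoding=enc) as f:
                return f.read(), enc
        except UnicodeDecodeError:
            continue
    raise ValueError(f"none of {encodings} could decode {path}")


# Demo: write a small CSV the way a Korean-locale Windows program might.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(demo_path, "w", encoding="cp949") as f:
    f.write("이름,값\n서울,1\n")

text, used = read_text_with_fallback(demo_path)
print(used)
print(text)
```

The encoding that worked can then be passed on, for example as pd.read_csv(demo_path, encoding=used).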


ModuleNotFoundError: No module named

airmaster 2021. 11. 16. 14:27


ModuleNotFoundError: No module named 'sklearn'

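This means the sklearn module cannot be found.
In other words, the module is not installed, so install it with pip from the conda prompt. Note that the package name for pip is scikit-learn, not sklearn.

For RStudio,

it must be installed in the r-reticulate virtual environment.
See the post below for details.
https://aeir.tistory.com/entry/ModuleNotFoundError-No-module-named-matplotlib

If the error is ModuleNotFoundError: No module named 'keras', install 'tensorflow' as well.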
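Before installing anything, it can help to check which modules the current interpreter is actually missing. This is a small sketch using the standard importlib machinery; the helper name missing_modules is made up for this example.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# The interpreter path shows which environment is doing the import.
print("interpreter:", sys.executable)
print("missing:", missing_modules(["sklearn", "keras", "tensorflow"]))
```

Anything reported as missing needs to be installed into the environment that owns the printed interpreter, not into some other environment.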


Python: commands for directory paths

airmaster 2021. 11. 16. 14:20


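Python code often needs to work with file paths and directories.

The functions for file and directory paths all live in the os module, so os must be imported first.

Get the current working directory: os.getcwd()

getcwd stands for "get current working directory".

import os
print(os.getcwd())  # /Users/evan/dev/python/web-crawler-py/parsed_data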

Change directory: os.chdir(path)

Get the absolute path of a given path: os.path.abspath(path)

Get only the directory part of a path: os.path.dirname(path)

Get only the file name part of a path: os.path.basename(path)

Get the directory part and the file name together: os.path.split(path)

dir_name, file_name = os.path.split("/Users/evan/dev/python/web-crawler-py/parsed_data")
print(dir_name, file_name, sep="\n")
# /Users/evan/dev/python/web-crawler-py
# parsed_data

Get a path as a list: split the path string on os.path.sep (the OS-specific path separator)

print("/Users/evan/dev/python/web-crawler-py/parsed_data".split(os.path.sep))
# On Linux/macOS, where os.path.sep is "/":
# ['', 'Users', 'evan', 'dev', 'python', 'web-crawler-py', 'parsed_data']
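Splitting on os.path.sep depends on the OS the script runs on. As an alternative sketch (not from the original post), pathlib can split a path into its parts for a chosen path flavour regardless of the current OS; the Windows path below is a hypothetical example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# POSIX-style path, parsed the same way on any OS.
posix_parts = PurePosixPath("/Users/evan/dev/python/web-crawler-py/parsed_data").parts
print(posix_parts)
# ('/', 'Users', 'evan', 'dev', 'python', 'web-crawler-py', 'parsed_data')

# Hypothetical Windows-style path, also parsed the same way on any OS.
windows_parts = PureWindowsPath(r"C:\Users\evan\dev\parsed_data").parts
print(windows_parts)
# ('C:\\', 'Users', 'evan', 'dev', 'parsed_data')
```

Unlike the string split, the root ("/" or the drive) comes back as its own first element instead of an empty string.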

Join paths into a new path: os.path.join(path, path1, path2, ...) joins the given paths into a single path.

print(os.path.join("/Users/evan/dev/python/web-crawler-py/parsed_data", "test"))
# /Users/evan/dev/python/web-crawler-py/parsed_data/test

List the files and subdirectories in a directory: os.listdir(path) returns the names of the files and directories directly under path.

print(os.listdir("/Users/evan/dev/python/web-crawler-py/parsed_data"))
# ['migrations', 'models.py', '__init__.py', '__pycache__', 'apps.py', 'parser.py', 'admin.py', 'tests.py', 'views.py']

Check whether a file or directory exists: os.path.exists(path)

print(os.path.exists("/Users/evan/dev/python/web-crawler-py/parsed_data"))  # True
print(os.path.exists("/Users/evan/dev/python/web-crawler-py/parsed_data/admin.py"))  # True

Check whether a path is an existing directory: os.path.isdir(path)

print(os.path.isdir("/Users/evan/dev/python/web-crawler-py/parsed_data"))  # True
print(os.path.isdir("/Users/evan/dev/python/web-crawler-py/parsed_data/admin.py"))  # False

Check whether a path is an existing file: os.path.isfile(path)

print(os.path.isfile("/Users/evan/dev/python/web-crawler-py/parsed_data"))  # False
print(os.path.isfile("/Users/evan/dev/python/web-crawler-py/parsed_data/admin.py"))  # True

File size in bytes: os.path.getsize(path)

print(os.path.getsize("/Users/evan/dev/python/web-crawler-py/parsed_data"))  # 352
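The paths in the examples above exist only on the original author's machine. The sketch below runs the same functions against a temporary directory so it can be tried anywhere; the file and folder names are borrowed from the listing above purely for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    sub_dir = os.path.join(root, "migrations")
    file_path = os.path.join(root, "parser.py")
    os.mkdir(sub_dir)
    with open(file_path, "w", encoding="utf-8") as f:
        f.write("print('hello')\n")

    # listdir order is arbitrary, so sort it for a stable result.
    listing = sorted(os.listdir(root))
    checks = {
        "exists": os.path.exists(file_path),
        "dir_is_dir": os.path.isdir(sub_dir),
        "file_is_dir": os.path.isdir(file_path),
        "file_is_file": os.path.isfile(file_path),
        "size_bytes": os.path.getsize(file_path),
    }

print(listing)
print(checks)
```

The temporary directory is deleted when the with block ends, so the checks are collected inside it and printed afterwards.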

Now let's use what we have covered so far to work out what the following statement actually means.

BASE_DIR = os.path.dirname(os.path.abspath(__file__))

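import os
print(__file__)
# /Users/evan/dev/python/web-crawler-py/parsed_data/parser.py
print(os.path.abspath(__file__))
# /Users/evan/dev/python/web-crawler-py/parsed_data/parser.py
print(os.path.dirname(os.path.abspath(__file__)))
# /Users/evan/dev/python/web-crawler-py/parsed_data

1. When a file is loaded as a module, its path is stored under the name __file__.

2. os.path.abspath() turns __file__ into an absolute path. If __file__ is already absolute, as here, the result is the same; a relative path is resolved against the current working directory.

3. os.path.dirname() takes the directory part of the absolute path.

4. The result is stored in a variable named BASE_DIR for later use.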
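The four steps above can be wrapped in a function and compared with a pathlib version. This is a sketch under the assumption that no symbolic links or ".." segments are involved; the function names are hypothetical, and a relative path stands in for __file__ so the block runs outside a module too.

```python
import os
from pathlib import Path


def base_dir_os(module_file):
    """The os.path form: directory of the absolute path."""
    return os.path.dirname(os.path.abspath(module_file))


def base_dir_pathlib(module_file):
    """A pathlib form; absolute() does not resolve symlinks or '..'."""
    return str(Path(module_file).absolute().parent)


# A relative path stands in for __file__ here; it is resolved against the
# current working directory.
example = os.path.join("parsed_data", "parser.py")
print(base_dir_os(example))
print(base_dir_os(example) == base_dir_pathlib(example))
```

Inside a real module the call would be BASE_DIR = base_dir_os(__file__).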

Source: https://itmining.tistory.com/122 [IT 마이닝]


Error: ModuleNotFoundError: No module named 'matplotlib' (RStudio)

airmaster 2021. 11. 11. 14:27


>>> from matplotlib import pyplot as plt

ModuleNotFoundError: No module named 'matplotlib'

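By working through the fix for the error above, this post also resolves the general problem of module conflicts between Anaconda and Miniconda when using Python from RStudio.

Deep learning code written in Python can also be run from R 4.1.0 and later.

After installing RStudio, the first time you run Python code such as import os, RStudio automatically starts reticulate::repl_python(). Looking closely at the console output, you can see that Miniconda3 gets installed along the way.

This Miniconda sometimes conflicts with an Anaconda that was already installed.

Error

For example, a matplotlib problem like the one below.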

>>> from matplotlib import pyplot as plt

ModuleNotFoundError: No module named 'matplotlib'

Solution

Anaconda and Miniconda installed on the PC

To check whether the module was simply not installed, run the following in Anaconda Prompt (anaconda3):

> pip list

The output shows that matplotlib is already installed.

So the place to look is Miniconda, not Anaconda.

r-reticulate

RStudio installs the reticulate package, and modules have to be installed into its r-reticulate virtual environment.

Following the steps below avoids the error in RStudio.

1. Open Anaconda Prompt (R~MINI~1)

2. (base) > conda env list

3. (base) > activate r-reticulate

4. (r-reticulate) > pip list

matplotlib does not appear anywhere in the list.

5. (r-reticulate) > pip install matplotlib

Avoid conda install where possible (related post: How to repair TensorFlow GPU without reinstalling when it does not work)

6. Finally, in RStudio, run

>>> from matplotlib import pyplot as plt

and the error is gone.
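To confirm which installation the Python session in RStudio is really using, the interpreter can be asked directly. The following is a minimal sketch; the helper name python_env_report is hypothetical, and CONDA_DEFAULT_ENV is an environment variable that conda sets on activation, so it may be absent.

```python
import importlib.util
import os
import sys


def python_env_report(module_name="matplotlib"):
    """Describe the running interpreter and whether it can see a module."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


for key, value in python_env_report().items():
    print(f"{key}: {value}")
```

If the executable path points into the Miniconda r-reticulate folder while the module was installed under Anaconda, that mismatch is the conflict described in this post.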


How to uninstall Anaconda (same for Miniconda)

airmaster 2021. 11. 11. 14:06


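1. To avoid possible version conflicts, remove the virtual environments first.

2. Uninstall Anaconda.

(The same procedure works for Miniconda.)

1. Remove the Anaconda virtual environments

The example below removes the r-reticulate virtual environment completely.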

(base) >conda env list
# conda environments:
#
base                  *  C:\Users\chpark\miniconda3
chpark                   C:\Users\chpark\miniconda3\envs\chpark
r-reticulate             C:\Users\chpark\miniconda3\envs\r-reticulate

(base) > conda remove --name r-reticulate --all

Remove all packages in environment C:\Users\chpark\miniconda3\envs\r-reticulate:

## Package Plan ##

  environment location: C:\Users\chpark\miniconda3\envs\r-reticulate

Proceed ([y]/n)?
The following packages will be REMOVED:

  certifi-2019.11.28-py36_0
  pip-20.0.2-py36_1
  python-3.6.10-h9f7ef89_0
  setuptools-46.0.0-py36_0
  sqlite-3.31.1-he774522_0
  vc-14.1-h0510ff6_4
  vs2015_runtime-14.16.27012-hf0eaf9b_1
  wheel-0.34.2-py36_0
  wincertstore-0.2-py36h7fe50ca_0

Proceed ([y]/n)?

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

(base) >conda env list
# conda environments:
#
base                  *  C:\Users\chpark\miniconda3
chpark                   C:\Users\chpark\miniconda3\envs\chpark

(base) C:\Users\chpark>

You can confirm that the r-reticulate virtual environment is completely gone.

Remove every environment except (base) in the same way.
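When there are many environments, the removal commands can be generated rather than typed one by one. This sketch only builds the command strings and does not run conda; the helper name removal_commands is hypothetical.

```python
def removal_commands(env_names, keep=("base",)):
    """Build one `conda remove --name NAME --all` command per environment."""
    return [
        f"conda remove --name {name} --all"
        for name in env_names
        if name not in keep
    ]


# Environment names taken from the `conda env list` output above.
for command in removal_commands(["base", "chpark", "r-reticulate"]):
    print(command)
```

Printing the commands first leaves a chance to review them before pasting them into the prompt.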

(base) cannot be removed from the prompt, so remove it from the system as described below.

2. Uninstall Anaconda (Windows 10)

For Anaconda, run Uninstall-Anaconda3.exe in the Anaconda install folder (typically C:\Users\[USER]\anaconda3).

For Miniconda, run C:\Users\[USER]\miniconda3\Uninstall-Miniconda3.exe.


R Keras: when model results come out as loss: NaN, mae: NaN, etc.

airmaster 2021. 11. 5. 18:19


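This post covers the case where the model runs without any error but the final results come out as loss: NaN, accuracy: NaN, mae: NaN and so on, together with the fixes I have found.

So far I have run into the cases below, and each can be resolved as follows.

1. The input data contains NA

In this case, remove the NA values from the data set as shown below before using it.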

> data.set <- na.omit(<INPUT DATA>)

> str(data.set)

2. The choice of activation function is the problem

If step 1 does not fix it, change the activation function in layer_dense from "softmax" to "relu". With a single output, using softmax can cause this symptom. See the comments in this issue: https://github.com/keras-team/keras/issues/2134
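A short worked example shows why softmax is a poor fit for a single output unit: softmax over one value is always 1.0, whatever the input. This Python sketch does not reproduce the NaN itself, which depends on the loss and the data; it only illustrates that the single softmax output carries no information, while relu passes positive values through.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relu(x):
    """Rectified linear unit: negative inputs become 0."""
    return max(0.0, x)


for logit in (-3.0, 0.0, 42.0):
    print(logit, softmax([logit]), relu(logit))
```

Every row prints [1.0] for softmax, so a regression target can never be matched by that layer.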

3. The input data contains Inf

See the blog post below.

https://klavier.tistory.com/entry/R%EC%97%90%EC%84%9C-NAN%EC%9D%B4%EB%82%98-INF%EB%A1%9C-%EC%9D%B8%ED%95%B4-%EC%A0%9C%EB%8C%80%EB%A1%9C-%EC%BD%94%EB%93%9C%EA%B0%80-%EB%8F%8C%EC%A7%80-%EC%95%8A%EC%9D%84-%EB%95%8C-%EC%A1%B0%EC%B9%98%ED%95%98%EB%8A%94-%EB%B0%A9%EB%B2%95
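Cases 1 and 3 both come down to non-finite values in the input. The post itself uses R (na.omit); the following is a Python sketch of the same idea that drops any row containing NaN or Inf. The helper name and the sample rows are hypothetical.

```python
import math


def drop_non_finite_rows(rows):
    """Keep only rows whose values are all finite (no NaN, +Inf or -Inf)."""
    return [row for row in rows if all(math.isfinite(v) for v in row)]


rows = [
    [1.0, 2.0],
    [float("nan"), 3.0],   # dropped: contains NaN
    [4.0, float("inf")],   # dropped: contains Inf
    [5.0, 6.0],
]
clean = drop_non_finite_rows(rows)
print(clean)  # [[1.0, 2.0], [5.0, 6.0]]
```

Checking the row count before and after shows how much data the cleaning step removed.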


R Keras: reading a csv file and splitting it into train and test data

airmaster 2021. 11. 5. 18:18


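When you take a polished example published in a book or on a website and apply it straight to your own data, for complete beginners like me it always, as always, as usual, inevitably, without fail, fatefully, ... refuses to work on the first try, and all sorts of problems can leave you with a pounding headache. No, they will. Let's fix them one at a time.

The example files provided in deep learning books all ship inside packages such as Keras. So when you read in a .csv file you made yourself, the input format does not match and you can get lost. In that case the model itself is fine, but you may get an error saying that input_data has no value.

Sampling as shown below easily converts the data into the format used when reading the example files in typical books. The example is for a data set with 13 columns.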

> set.seed(1)

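# sample() draws different random numbers on every run, so the seed is fixed to get the same values each time.

> smp <- sample(1:nrow(<INPUT DATA>), nrow(<INPUT DATA>)/2)

# Dividing by 2 means half of the rows are drawn for training; the remaining rows become the test set.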

> train_data <- <INPUT DATA>[smp, 1:12]

> train_targets <- <INPUT DATA>[smp, 13]

> test_data <- <INPUT DATA>[-smp, 1:12]

> test_targets <- <INPUT DATA>[-smp, 13]

After this, build the Keras model.
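For readers who work on the Python side, the same split can be sketched with the standard library. It assumes, like the R code above, 12 feature columns followed by 1 target column; the data is made up, and Python's random generator will not pick the same rows as R's set.seed(1).

```python
import random


def split_half(rows, n_features=12, seed=1):
    """Randomly split rows in half, then separate features from the target."""
    rng = random.Random(seed)  # fixed seed, same idea as set.seed(1)
    train_idx = set(rng.sample(range(len(rows)), len(rows) // 2))
    train = [r for i, r in enumerate(rows) if i in train_idx]
    test = [r for i, r in enumerate(rows) if i not in train_idx]

    def features(rs):
        return [r[:n_features] for r in rs]

    def targets(rs):
        return [r[n_features] for r in rs]

    return features(train), targets(train), features(test), targets(test)


# Hypothetical data: 10 rows, 12 feature columns plus 1 target column.
rows = [[float(i)] * 12 + [float(i * 10)] for i in range(10)]
train_data, train_targets, test_data, test_targets = split_half(rows)
print(len(train_data), len(test_data))  # 5 5
```

Every row lands in exactly one of the two sets, which mirrors the smp / -smp indexing in the R version.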


R ggplot: drawing stacked bar charts

airmaster 2021. 11. 5. 18:06


Source: Grouped, stacked and percent stacked barplot in ggplot2 – the R Graph Gallery (r-graph-gallery.com)

Grouped barchart

A grouped barplot displays a numeric value for a set of entities split into groups and subgroups. Before trying to build one, check how to make a basic barplot with R and ggplot2.

A few explanations about the code below:

The input dataset must provide 3 columns: the numeric value (value), and 2 categorical variables for the group (specie) and the subgroup (condition) levels.
In the aes() call, x is the group (specie), and the subgroup (condition) is given to the fill argument.
In the geom_bar() call, position="dodge" must be specified to place the bars side by side.

# library
library(ggplot2)

# create a dataset
specie <- c(rep("sorgho", 3), rep("poacee", 3), rep("banana", 3), rep("triticum", 3))
condition <- rep(c("normal", "stress", "Nitrogen"), 4)
value <- abs(rnorm(12, 0, 15))
data <- data.frame(specie, condition, value)

# Grouped
ggplot(data, aes(fill=condition, y=value, x=specie)) +
    geom_bar(position="dodge", stat="identity")

Stacked barchart

A stacked barplot is very similar to the grouped barplot above. The subgroups are just displayed on top of each other, not beside.

The only thing to change to get this figure is to switch the position argument to stack.

# library
library(ggplot2)

# create a dataset
specie <- c(rep("sorgho", 3), rep("poacee", 3), rep("banana", 3), rep("triticum", 3))
condition <- rep(c("normal", "stress", "Nitrogen"), 4)
value <- abs(rnorm(12, 0, 15))
data <- data.frame(specie, condition, value)

# Stacked
ggplot(data, aes(fill=condition, y=value, x=specie)) +
    geom_bar(position="stack", stat="identity")

Percent stacked barchart

Once more, there is not much to do to switch to a percent stacked barplot. Just switch to position="fill". Now the percentage of each subgroup is represented, which makes it possible to study how their proportions change across the whole.

# library
library(ggplot2)

# create a dataset
specie <- c(rep("sorgho", 3), rep("poacee", 3), rep("banana", 3), rep("triticum", 3))
condition <- rep(c("normal", "stress", "Nitrogen"), 4)
value <- abs(rnorm(12, 0, 15))
data <- data.frame(specie, condition, value)

# Stacked + percent
ggplot(data, aes(fill=condition, y=value, x=specie)) +
    geom_bar(position="fill", stat="identity")
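What position="fill" does to the numbers can be sketched without any plotting: each group's values are rescaled so that they sum to 1. The Python snippet below is only an illustration of that arithmetic with made-up values, not ggplot2's implementation, and it assumes every group has a positive total.

```python
from collections import defaultdict


def fill_proportions(records):
    """Rescale each group's values so they sum to 1, like position="fill"."""
    totals = defaultdict(float)
    for group, _, value in records:
        totals[group] += value
    return [(group, sub, value / totals[group]) for group, sub, value in records]


# Hypothetical values; the R code above draws them at random.
records = [
    ("sorgho", "normal", 10.0),
    ("sorgho", "stress", 30.0),
    ("banana", "normal", 5.0),
    ("banana", "stress", 15.0),
]
for row in fill_proportions(records):
    print(row)
```

Both species end up with the same 0.25 / 0.75 split even though their raw totals differ, which is exactly the information a percent stacked chart is meant to show.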

Grouped barchart customization

As usual, some customization is often necessary to make the chart look better and more personal. Let's:

add a title
use a theme
change the color palette
customize axis titles

# library
library(ggplot2)
library(viridis)
library(hrbrthemes)

# create a dataset
specie <- c(rep("sorgho", 3), rep("poacee", 3), rep("banana", 3), rep("triticum", 3))
condition <- rep(c("normal", "stress", "Nitrogen"), 4)
value <- abs(rnorm(12, 0, 15))
data <- data.frame(specie, condition, value)

# Customized chart
ggplot(data, aes(fill=condition, y=value, x=specie)) +
    geom_bar(position="stack", stat="identity") +
    scale_fill_viridis(discrete = TRUE) +
    ggtitle("Studying 4 species..") +
    theme_ipsum() +
    xlab("")

Small multiple

Small multiples can be used as an alternative to stacking or grouping. They are straightforward to make thanks to the facet_wrap() function.

# library
library(ggplot2)
library(viridis)
library(hrbrthemes)

# create a dataset
specie <- c(rep("sorgho", 3), rep("poacee", 3), rep("banana", 3), rep("triticum", 3))
condition <- rep(c("normal", "stress", "Nitrogen"), 4)
value <- abs(rnorm(12, 0, 15))
data <- data.frame(specie, condition, value)

# Graph
ggplot(data, aes(fill=condition, y=value, x=condition)) +
    geom_bar(position="dodge", stat="identity") +
    scale_fill_viridis(discrete = TRUE, option = "E") +
    ggtitle("Studying 4 species..") +
    facet_wrap(~specie) +
    theme_ipsum() +
    theme(legend.position="none") +
    xlab("")
