I'm running the BERT pre-training code on a Cloud TPU from Compute Engine.
Every time I run it, I get this error in one thread, but training continues normally.
I ran the same code on a Google Colab TPU and it worked fine.
For tpu_cluster_resolver I'm passing the IP address of the TPU instance. I also tried passing the zone and project name, with the same result.
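For reference, the two ways of configuring the resolver described above can be sketched roughly as follows. This is a minimal illustration, not the actual BERT script: the helper names `resolver_kwargs` and `make_resolver` are hypothetical, and the IP address, TPU name, zone and project are placeholder values.

```python
def resolver_kwargs(tpu, zone=None, project=None):
    """Collect keyword arguments for TPUClusterResolver, omitting unset ones."""
    kwargs = {"tpu": tpu}
    if zone is not None:
        kwargs["zone"] = zone
    if project is not None:
        kwargs["project"] = project
    return kwargs


def make_resolver(tpu, zone=None, project=None):
    """Create the resolver (hypothetical wrapper; requires TensorFlow)."""
    # Imported lazily so resolver_kwargs can be used without TensorFlow installed.
    import tensorflow as tf
    return tf.distribute.cluster_resolver.TPUClusterResolver(
        **resolver_kwargs(tpu, zone, project))


# Variant 1: only the address of the TPU instance (placeholder IP and port).
by_address = resolver_kwargs("10.0.0.2:8470")

# Variant 2: TPU name plus zone and project (all placeholder values).
by_name = resolver_kwargs("my-tpu", zone="us-central1-b", project="my-project")

print(by_address)
print(by_name)
```

Calling `make_resolver(**by_name)` would build the actual resolver object on a machine with TensorFlow installed.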
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/cluster_resolver/tpu_cluster_resolver.py", line 476, in _fetch_cloud_tpu_metadata
    return request.execute()
  File "/usr/local/lib/python3.5/dist-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/googleapiclient/http.py", line 856, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://tpu.googleapis.com/v1/projects/None/locations/None/nodes/xxxxxx:8470?alt=json returned "Permission denied on resource project None.". Details: "[{'links': [{'url': 'https://console.developers.google.com/project/None/apiui/credential', 'description': 'Google developer console API key'}], '@type': 'type.googleapis.com/google.rpc.Help'}]">

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/tpu/preempted_hook.py", line 87, in run
    response = self._cluster._fetch_cloud_tpu_metadata() # pylint: disable=protected-access
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/cluster_resolver/tpu_cluster_resolver.py", line 480, in _fetch_cloud_tpu_metadata
    "constructor. Exception: %s" % (self._tpu, e))
ValueError: Could not lookup TPU metadata from name 'b'xxxxxxxx:8470''. Please doublecheck the tpu argument in the TPUClusterResolver constructor. Exception: <HttpError 403 when requesting https://tpu.googleapis.com/v1/projects/None/locations/None/nodes/xxxxxx:8470?alt=json returned "Permission denied on resource project None.". Details: "[{'links': [{'url': 'https://console.developers.google.com/project/None/apiui/credential', 'description': 'Google developer console API key'}], '@type': 'type.googleapis.com/google.rpc.Help'}]">
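
The request URL in the error can be picked apart to see which fields ended up unset. The sketch below uses only the standard library; the helper name `parse_tpu_node_url` is made up for illustration, and the URL is copied from the traceback above (with the node value redacted as in the log).

```python
import re

# Matches the resource path seen in the failing request URL.
_NODE_URL = re.compile(
    r"/projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/nodes/(?P<node>[^/?]+)"
)


def parse_tpu_node_url(url):
    """Return the project, location and node segments of a TPU node URL."""
    match = _NODE_URL.search(url)
    if match is None:
        raise ValueError("not a TPU node URL: %r" % url)
    return match.groupdict()


url = ("https://tpu.googleapis.com/v1/projects/None/locations/None"
       "/nodes/xxxxxx:8470?alt=json")
fields = parse_tpu_node_url(url)

# Segments that were formatted from a Python None value.
unset = sorted(name for name, value in fields.items() if value == "None")

print(fields)
print(unset)
```

Here both the project and the location segments are the literal string `None`, and the node segment carries an address with a port rather than a TPU name, which seems consistent with the "Permission denied on resource project None" message.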