
import math
from collections.abc import Iterable
from typing import TYPE_CHECKING, Optional, Union

import torch

from ...image_processing_base import BatchFeature
from ...image_processing_utils_fast import BaseImageProcessorFast, DefaultFastImageProcessorKwargs
from ...image_transforms import group_images_by_shape, reorder_images
from ...image_utils import IMAGENET_STANDARD_MEAN, IMAGENET_STANDARD_STD, PILImageResampling, SizeDict
from ...utils import TensorType, auto_docstring, requires_backends
from ..beit.image_processing_beit_fast import BeitImageProcessorFast


if TYPE_CHECKING:
    from ...modeling_outputs import DepthEstimatorOutput

from torchvision.transforms.v2 import functional as F


def get_resize_output_image_size(
    input_image: "torch.Tensor",
    output_size: Union[int, Iterable[int]],
    keep_aspect_ratio: bool,
    multiple: int,
) -> SizeDict:
    def constrain_to_multiple_of(val, multiple, min_val=0, max_val=None):
        x = round(val / multiple) * multiple

        if max_val is not None and x > max_val:
            x = math.floor(val / multiple) * multiple

        if x < min_val:
            x = math.ceil(val / multiple) * multiple

        return x

    input_height, input_width = input_image.shape[-2:]
    output_height, output_width = output_size

    # Determine the scale factor for each dimension.
    scale_height = output_height / input_height
    scale_width = output_width / input_width

    if keep_aspect_ratio:
        # Scale as little as possible: use the factor closest to 1 for both dimensions.
        if abs(1 - scale_width) < abs(1 - scale_height):
            # Fit width
            scale_height = scale_width
        else:
            # Fit height
            scale_width = scale_height

    new_height = constrain_to_multiple_of(scale_height * input_height, multiple=multiple)
    new_width = constrain_to_multiple_of(scale_width * input_width, multiple=multiple)

    return SizeDict(height=new_height, width=new_width)
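

# A worked example (illustrative, following the logic above): with `keep_aspect_ratio=True`
# the helper keeps the scale factor closest to 1.0 for both axes, so a 480x640 image
# targeted at 384x384 with `multiple=32` resolves to 384x512 (aspect ratio preserved):
#
#     image = torch.zeros(3, 480, 640)
#     get_resize_output_image_size(image, (384, 384), keep_aspect_ratio=True, multiple=32)
#     # -> SizeDict(height=384, width=512)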


class DPTFastImageProcessorKwargs(DefaultFastImageProcessorKwargs):
    """
    ensure_multiple_of (`int`, *optional*, defaults to 1):
        If `do_resize` is `True`, the image is resized to a size that is a multiple of this value. Can be overridden
        by `ensure_multiple_of` in `preprocess`.
    size_divisor (`int`, *optional*):
        If `do_pad` is `True`, pads the image dimensions to be divisible by this value. This was introduced in the
        DINOv2 paper, which uses the model in combination with DPT.
    keep_aspect_ratio (`bool`, *optional*, defaults to `False`):
        If `True`, the image is resized to the largest possible size such that the aspect ratio is preserved. Can
        be overridden by `keep_aspect_ratio` in `preprocess`.
    do_reduce_labels (`bool`, *optional*, defaults to `self.do_reduce_labels`):
        Whether or not to reduce all label values of segmentation maps by 1. Usually used for datasets where 0
        is used for background, and background itself is not included in all classes of a dataset (e.g.
        ADE20k). The background label will be replaced by 255.
    """

    ensure_multiple_of: Optional[int]
    size_divisor: Optional[int]
    keep_aspect_ratio: Optional[bool]
    do_reduce_labels: Optional[bool]
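

# Illustrative sketch: these kwargs are forwarded through `preprocess`/`__call__`
# (`images` being any list of PIL, NumPy, or torch inputs), e.g.
#
#     processor = DPTImageProcessorFast()
#     inputs = processor(images, keep_aspect_ratio=True, ensure_multiple_of=32, return_tensors="pt")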
__module____qualname____doc__r   int__annotations__bool r+   r)   r=   r=   U   s^            !%%%3-~%%%tn$$$$$r+   r=   c            '       &   e Zd Zej        ZeZeZ	dddZ
dZdZdZdZdZdZdZdZdZdZdZeZ	 	 	 	 d&dd	d
eded         dedee         dedd	fdZ	 d'dd	dedd	fdZded	         deded
eded         dedededededeeeee         f                  deeeee         f                  dedee         dedee         dee         d eee e!f                  de"f&d!Z#	 d(d"d#d$eee!ee$eef                  df                  dee%e e!f                  fd%Z&dS ))DPTImageProcessorFasti  r.   TFgp?r-   Nimager   sizeinterpolationzF.InterpolationMode	antialiasr>   r   r   c                     |j         r|j        s$t          d|                                           t	          ||j         |j        f||          }t          j        | ||||          S )a<  
        Resize an image to `(size["height"], size["width"])`.

        Args:
            image (`torch.Tensor`):
                Image to resize.
            size (`SizeDict`):
                Dictionary in the format `{"height": int, "width": int}` specifying the size of the output image.
            interpolation (`InterpolationMode`, *optional*, defaults to `InterpolationMode.BILINEAR`):
                `InterpolationMode` filter to use when resizing the image e.g. `InterpolationMode.BICUBIC`.
            antialias (`bool`, *optional*, defaults to `True`):
                Whether to use antialiasing when resizing the image.
            ensure_multiple_of (`int`, *optional*):
                If `do_resize` is `True`, the image is resized to a size that is a multiple of this value
            keep_aspect_ratio (`bool`, *optional*, defaults to `False`):
                If `True`, and `do_resize` is `True`, the image is resized to the largest possible size such that the aspect ratio is preserved.

        Returns:
            `torch.Tensor`: The resized image.
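
        Example (illustrative sketch; `DPTImageProcessorFast()` here uses the default settings):

        ```python
        processor = DPTImageProcessorFast()
        image = torch.rand(3, 480, 640)
        resized = processor.resize(
            image, SizeDict(height=384, width=384), keep_aspect_ratio=True, ensure_multiple_of=32
        )
        # keep_aspect_ratio keeps the milder scale factor, so the result is 384x512.
        ```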
        """
        if not size.height or not size.width:
            raise ValueError(f"The size dictionary must contain the keys 'height' and 'width'. Got {size.keys()}")

        output_size = get_resize_output_image_size(
            image,
            output_size=(size.height, size.width),
            keep_aspect_ratio=keep_aspect_ratio,
            multiple=ensure_multiple_of,
        )
        return BaseImageProcessorFast.resize(
            self, image, size=output_size, interpolation=interpolation, antialias=antialias
        )

    def pad_image(
        self,
        image: "torch.Tensor",
        size_divisor: int,
    ) -> "torch.Tensor":
        """
        Center pad a batch of images to be a multiple of `size_divisor`.

        Args:
            image (`torch.Tensor`):
                Image to pad. Can be a batch of images of dimensions (N, C, H, W) or a single image of dimensions (C, H, W).
            size_divisor (`int`):
                The width and height of the image will be padded to a multiple of this number.
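
        Example (illustrative sketch):

        ```python
        processor = DPTImageProcessorFast()
        image = torch.rand(3, 480, 640)
        processor.pad_image(image, size_divisor=64).shape  # -> torch.Size([3, 512, 640])
        ```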
        r,   Nc                 \    t          j        | |z            |z  }|| z
  }|dz  }||z
  }||fS )Nr   )r"   r$   )rL   r?   new_sizepad_sizepad_size_leftpad_size_rights         r)   _get_padz1DPTImageProcessorFast.pad_image.<locals>._get_pad   sB    y!455DH$H$MM%5N .00r+   )r1   Fpad)rS   rK   r?   r/   r0   rZ   pad_top
pad_bottompad_left	pad_rightpaddings              r)   	pad_imagezDPTImageProcessorFast.pad_image   su     BCC(	1 	1 	1 'hv|<<&hul;;)Wi<uUG$$$r+   imagesr@   	do_resizedo_center_crop	crop_size
do_rescalerescale_factordo_normalize
image_mean	image_stddo_paddisable_groupingreturn_tensorsc           	      h   |r|                      |          }t          ||          \  }}i }|                                D ]&\  }}|r|                     |||||          }|||<   't	          ||          }t          ||          \  }}i }|                                D ]T\  }}|r|                     ||          }|r|                     ||          }|                     |||	|
||          }|||<   Ut	          ||          }|rt          j	        |d          n|}t          d|i          S )N)rm   )rK   rL   rM   r>   r   r   )dimpixel_values)data)reduce_labelr   itemsrR   r   center_croprb   rescale_and_normalizetorchstackr   )rS   rc   r@   rd   rL   rM   re   rf   rg   rh   ri   rj   rk   r   r>   rl   r?   rm   rn   kwargsgrouped_imagesgrouped_images_indexresized_images_groupedr1   stacked_imagesresized_imagesprocessed_images_groupedprocessed_imagess                               r)   _preprocessz!DPTImageProcessorFast._preprocess   s   ,  	/&&v..F 0EV^n/o/o/o,,!#%3%9%9%;%; 		; 		;!E> !%("/'9&7 "- " " -;"5))'(>@TUU 0E^fv/w/w/w,,#% %3%9%9%;%; 		= 		=!E> M!%!1!1.)!L!L N!%!M!M!77
NL*V_ N /=$U++)*BDXYYCQg5;'7Q????Wg.2B!CDDDDr+   outputsr   target_sizesc                    t          | d           |j        }|/t          |          t          |          k    rt          d          g }|dgt          |          z  n|}t	          ||          D ]~\  }}|`t
          j        j                            |	                    d          	                    d          |dd          
                                }|                    d	|i           |S )
a  
        Converts the raw output of [`DepthEstimatorOutput`] into final depth predictions and depth PIL images.
        Only supports PyTorch.

        Args:
            outputs ([`DepthEstimatorOutput`]):
                Raw outputs of the model.
            target_sizes (`TensorType` or `List[Tuple[int, int]]`, *optional*):
                Tensor of shape `(batch_size, 2)` or list of tuples (`Tuple[int, int]`) containing the target size
                (height, width) of each image in the batch. If left to None, predictions will not be resized.

        Returns:
            `List[Dict[str, TensorType]]`: A list of dictionaries of tensors representing the processed depth
            predictions.
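
        Example (illustrative sketch; `outputs` comes from a DPT depth-estimation forward pass):

        ```python
        results = processor.post_process_depth_estimation(outputs, target_sizes=[(480, 640)])
        depth = results[0]["predicted_depth"]  # tensor of shape (480, 640)
        ```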
        """
        requires_backends(self, "torch")

        predicted_depth = outputs.predicted_depth

        if (target_sizes is not None) and (len(predicted_depth) != len(target_sizes)):
            raise ValueError(
                "Make sure that you pass in as many target sizes as the batch dimension of the predicted depth"
            )

        results = []
        target_sizes = [None] * len(predicted_depth) if target_sizes is None else target_sizes
        for depth, target_size in zip(predicted_depth, target_sizes):
            if target_size is not None:
                # `interpolate` expects a 4D (N, C, H, W) input, hence the double unsqueeze/squeeze.
                depth = torch.nn.functional.interpolate(
                    depth.unsqueeze(0).unsqueeze(1), size=target_size, mode="bicubic", align_corners=False
                ).squeeze()

            results.append({"predicted_depth": depth})

        return results


__all__ = ["DPTImageProcessorFast"]